Abstract. This paper considers fixed effects estimation and inference in linear and nonlinear
panel data models with random coefficients and endogenous regressors. The quantities
of interest - means, variances, and other moments of the random coefficients - are estimated 
by cross sectional sample moments of GMM estimators applied separately to the time se- 
ries of each individual. To deal with the incidental parameter problem introduced by the 
noise of the within-individual estimators in short panels, we develop bias corrections. These 
corrections are based on higher-order asymptotic expansions of the GMM estimators and 
produce improved point and interval estimates in moderately long panels. Under asymptotic 
sequences where the cross sectional and time series dimensions of the panel pass to infinity 
at the same rate, the uncorrected estimator has an asymptotic bias of the same order as 
the asymptotic variance. The bias corrections remove the bias without increasing variance. 
An empirical example on cigarette demand based on Becker, Grossman and Murphy (1994) 
shows significant heterogeneity in the price effect across U.S. states.

1. Introduction

This paper considers estimation and inference in linear and nonlinear panel data models 
with random coefficients and endogenous regressors. The quantities of interest are means, 
variances, and other moments of the distribution of the random coefficients. In a state level 
panel model of rational addiction, for example, we might be interested in the mean and vari- 
ance of the distribution of the price effect on cigarette consumption across states, controlling 
for endogenous past and future consumptions. These models pose important challenges in 
estimation and inference if the relation between the regressors and individual coefficients is 
left unrestricted. Fixed effects methods based on GMM estimators applied separately to the 
time series of each individual can be severely biased due to the incidental parameter problem. 
The bias arises from the finite-sample bias of GMM when some of the regressors are endogenous
or the model is nonlinear in parameters, and from the nonlinearity of the estimand when the
parameter of interest is the variance or another higher-order moment of the individual
coefficients. Neglecting the
heterogeneity and imposing fixed coefficients does not solve the problem, because the result- 
ing estimators are generally inconsistent for the mean of the random coefficients (Yitzhaki, 
1996, and Angrist, Graddy and Imbens, 2000).^1 Moreover, imposing fixed coefficients does
not allow us to estimate other moments of the distribution of the random coefficients. 

We introduce a class of bias-corrected panel fixed effects GMM estimators. Thus, instead 
of imposing fixed coefficients, we estimate different coefficients for each individual using the 
time series observations and correct for the resulting incidental parameter bias. For linear 
models, in addition to the bias correction, these estimators differ from the standard fixed 
effects estimators in that both the intercept and the slopes are different for each individual. 
Moreover, unlike for the classical random coefficient estimators, they do not rely on any 
restriction in the relationship between the regressors and random coefficients; see Hsiao and 
Pesaran (2004) for a recent survey on random coefficient models. This flexibility allows us 
to account for Roy (1951) type selection where the regressors are decision variables with 
levels determined by their returns. Linear models with Roy selection are commonly referred 
to as correlated random coefficient models in the panel data literature. In the presence of 
endogenous regressors, treating the random coefficients as fixed effects is also convenient to 
overcome the identification problems in these models pointed out by Kelejian (1974). 

The most general models we consider are semiparametric in the sense that the distribution 
of the individual coefficients is unspecified and the parameters are identified from moment 
conditions. These conditions can be nonlinear functions in parameters and variables, accom- 
modating both linear and nonlinear random coefficient models, and allowing for the presence 
of time varying endogeneity in the regressors not captured by the individual coefficients. We
use the moment conditions to estimate the model parameters and other quantities of interest
via GMM methods applied separately to the time series of each individual. The resulting
estimates can be severely biased in short panels due to the incidental parameters problem,
which in this case is a consequence of the finite-sample bias of GMM (Newey and Smith,
2004) and/or the nonlinearity of the quantities of interest in the individual coefficients. We
develop analytical corrections to reduce the bias.

1 Heckman and Vytlacil (2000) and Angrist (2004) find sufficient conditions for fixed coefficient OLS and IV
estimators to be consistent for the average coefficient.

To derive the bias corrections, we use higher-order expansions of the GMM estimators, 
extending the analysis in Newey and Smith (2004) for cross sectional estimators to panel data 
estimators with fixed effects and serial dependence. If n and T denote the cross sectional 
and time series dimensions of the panel, the corrections remove the leading term of the bias 
of order $O(T^{-1})$, and center the asymptotic distribution at the true parameter value under
sequences where n and T grow at the same rate. This approach is designed to perform well in
econometric applications that use moderately long panels, where the most important part
of the bias is captured by the first term of the expansion. Previous studies that used
a similar approach for the analysis of linear and nonlinear fixed effects estimators in panel
data include, among others, Kiviet (1995), Phillips and Moon (1999), Alvarez and Arellano
(2003), Hahn and Kuersteiner (2002), Lancaster (2002), Woutersen (2002), Hahn and Newey
(2004), and Hahn and Kuersteiner (2011). See Arellano and Hahn (2007) for a survey of this
literature and additional references.

A first distinctive feature of our corrections is that they can be used in overidentified mod- 
els where the number of moment restrictions is greater than the dimension of the parameter 
vector. This situation is common in economic applications such as rational expectation mod- 
els. Overidentification complicates the analysis by introducing an initial stage for estimating 
optimal weighting matrices to combine the moment conditions, and precludes the use of 
the existing methods. For example, Hahn and Newey's (2004) and Hahn and Kuersteiner's 
(2011) general bias reduction methods for nonlinear panel data models do not cover optimal 
two-step GMM estimators. A second distinctive feature is that our results are specifically 
developed for models with multiple nonadditive heterogeneity, whereas the previous studies 
focused mostly on models with additive heterogeneity captured by a scalar individual effect.
Exceptions include Arellano and Hahn (2006) and Bester and Hansen (2008), which also con- 
sidered multiple heterogeneity, but they focus on parametric likelihood-based panel models 
with exogenous regressors. Bai (2009) analyzed related linear panel models with exogenous 
regressors and multidimensional interactive individual effects. Bai's nonadditive heterogene- 
ity allows for interaction between individual effects and unobserved factors, whereas the 
nonadditive heterogeneity that we consider allows for interaction between individual effects 
and observed regressors. A third distinctive feature of our analysis is the focus on moments 
of the distribution of the individual effects as one of the main quantities of interest. 






We illustrate the applicability of our methods with empirical and numerical examples 
based on the cigarette demand application of Becker, Grossman and Murphy (1994). Here, 
we estimate a linear rational addictive demand model with state-specific coefficients for price 
and common parameters for the other regressors using a panel data set of U.S. states. We find 
that standard estimators that do not account for non-additive heterogeneity by imposing a 
constant coefficient for price can have important biases for the common parameters, mean of 
the price coefficient and demand elasticities. The analytical bias corrections are effective in 
removing the bias of the estimates of the mean and standard deviation of the price coefficient. 
Figure 1 gives a preview of the empirical results. It plots a normal approximation to the
distribution of the price effect based on uncorrected and bias corrected estimates of the 
mean and standard deviation of the distribution of the price coefficient. The figure shows 
that there is important heterogeneity in the price effect across states. The bias correction 
reduces by more than 15% the absolute value of the estimate of the mean effect and by 30% 
the estimate of the standard deviation. 

Some of the results for the linear model are related to the recent literature on correlated 
random coefficient panel models with fixed T. Graham and Powell (2008) gave identification 
and estimation results for average effects. Arellano and Bonhomme (2010) studied identi- 
fication of the distributional characteristics of the random coefficients in exogenous linear 
models. None of these papers considered the case where some of the regressors have time 
varying endogeneity not captured by the random coefficients or the model is nonlinear. For 
nonlinear models, Chernozhukov, Fernandez-Val, Hahn and Newey (2010) considered
identification and estimation of average and quantile treatment effects. Their nonparametric and
semiparametric bounds do not require large-T, but they do not cover models with continuous 
regressors and time varying endogeneity. 

The rest of the paper is organized as follows. Section 2 illustrates the type of models
considered and discusses the nature of the bias in two examples. Section 3 introduces the
general model and fixed effects GMM estimators. Section 4 derives the asymptotic properties
of the estimators. The bias corrections and their asymptotic properties are given in Section
5. Section 6 describes the empirical and numerical examples. Section 7 concludes with a
summary of the main results. Additional numerical examples, proofs and other technical
details are given in the online supplementary appendix Fernandez-Val and Lee (2012).

2. Motivating examples 

In this section we describe in detail two simple examples to illustrate the nature of the bias 
problem. The first example is a linear correlated random coefficient model with endogenous 
regressors. We show that averaging IV estimators applied separately to the time series of each 
individual is biased for the mean of the random coefficients because of the finite-sample bias
of IV. The second example considers estimation of the variance of the individual coefficients
in a simple setting without endogeneity. Here the sample variance of the estimators of the
individual coefficients is biased because of the non-linearity of the variance operator in the
individual coefficients. The discussion in this section is heuristic, leaving to Section 4 the
specification of precise regularity conditions for the validity of the asymptotic expansions
used.

2.1. Correlated random coefficient model with endogenous regressors. Consider
the following panel model:

\[ y_{it} = \alpha_{0i} + \alpha_{1i} x_{it} + \epsilon_{it}, \quad (i = 1, \ldots, n;\ t = 1, \ldots, T); \tag{2.1} \]

where $y_{it}$ is a response variable, $x_{it}$ is an observable regressor, $\epsilon_{it}$ is an unobservable error
term, and i and t usually index individual and time period, respectively.^2 This is a linear
random coefficient model where the effect of the regressor is heterogeneous across individuals, but
no restriction is imposed on the distribution of the individual effect vector $\alpha_i := (\alpha_{0i}, \alpha_{1i})'$.
The regressor can be correlated with the error term and a valid instrument $(1, z_{it})$ is available
for $(1, x_{it})$, that is $E[\epsilon_{it} \mid \alpha_i] = 0$, $E[z_{it}\epsilon_{it} \mid \alpha_i] = 0$ and $\mathrm{Cov}[z_{it}, x_{it} \mid \alpha_i] \neq 0$. An important
example of this model is the panel version of the treatment-effect model (Wooldridge, 2002,
Chapter 10.2.3, and Angrist and Hahn, 2004). Here, the objective is to evaluate the effect
of a treatment (D) on an outcome variable (Y). The average causal effect for each level
of treatment is defined as the difference between the potential outcome that the individual
would obtain with and without the treatment, $Y_d - Y_0$. If individuals can choose the level
of treatment, potential outcomes and levels of treatment are generally correlated. An
instrumental variable Z can be used to identify the causal effect. If potential outcomes are
represented as the sum of permanent individual components and transitory individual-time
specific shocks, that is $Y_{jit} = Y_{ji} + \epsilon_{jit}$ for $j \in \{0, 1\}$, then we can write this model as a
special case of (2.1) with $y_{it} = (1 - D_{it})Y_{0it} + D_{it}Y_{1it}$, $\alpha_{0i} = Y_{0i}$, $\alpha_{1i} = Y_{1i} - Y_{0i}$, $x_{it} = D_{it}$,
$z_{it} = Z_{it}$, and $\epsilon_{it} = (1 - D_{it})\epsilon_{0it} + D_{it}\epsilon_{1it}$.

Suppose that we are ultimately interested in $\alpha_1 := E[\alpha_{1i}]$, the mean of the random slope
coefficient. We could neglect the heterogeneity and run fixed effects OLS and IV regressions
in

\[ y_{it} = \alpha_{0i} + \alpha_1 x_{it} + u_{it}, \]

where $u_{it} = x_{it}(\alpha_{1i} - \alpha_1) + \epsilon_{it}$ in terms of the model (2.1). In this case, OLS and IV estimate
weighted means of the individual coefficients in the population; see, for example, Yitzhaki
(1996) and Angrist and Krueger (1999) for OLS, and Angrist, Graddy and Imbens (2000)
for IV. OLS puts more weight on individuals with higher variances of the regressor because
they give more information about the slope, whereas IV weights individuals in proportion to
the variance of the first stage fitted values because these variances reflect the amount of
information that the individuals convey about the part of the slope affected by the instrument.
These weighted means are generally different from the mean effect because the weights can
be correlated with the individual effects.

2 More generally, i denotes a group index and t indexes the observations within the group. Examples of
groups include individuals, states, households, schools, or twins.

To see how these implicit OLS and IV weighting schemes affect the estimand of the
fixed-coefficient estimators, assume for simplicity that the relationship between $x_{it}$ and $z_{it}$ is linear,
that is $x_{it} = \pi_{0i} + \pi_{1i} z_{it} + v_{it}$, $(\epsilon_{it}, v_{it})$ is normal conditional on $(z_{it}, \alpha_i, \pi_i)$, $z_{it}$ is independent
of $(\alpha_i, \pi_i)$, and $(\alpha_i, \pi_i)$ is normal, for $\pi_i := (\pi_{0i}, \pi_{1i})'$. Then, the probability limits of the OLS
and IV estimators are

\[ \alpha_1^{OLS} = \alpha_1 + \{\mathrm{Cov}[\epsilon_{it}, v_{it}] + 2E[\pi_{1i}]\mathrm{Var}[z_{it}]\mathrm{Cov}[\alpha_{1i}, \pi_{1i}]\}/\mathrm{Var}[x_{it}], \]

\[ \alpha_1^{IV} = \alpha_1 + \mathrm{Cov}[\alpha_{1i}, \pi_{1i}]/E[\pi_{1i}]. \]

These expressions show that the OLS estimand differs from the average coefficient in the presence
of endogeneity, i.e., nonzero correlation between the individual-time specific error terms, or
whenever the individual coefficients are correlated, while the IV estimand differs from the
average coefficient only in the latter case.^4 In the treatment-effects model, there exists
correlation between the error terms in the presence of endogeneity bias, and correlation between
the individual effects arises under Roy-type selection, i.e., when individuals who experience
a higher permanent effect of the treatment are relatively more prone to accept the offer
of treatment. Wooldridge (2005) and Murtazashvili and Wooldridge (2005) give sufficient
conditions for consistency of standard OLS and IV fixed effects estimators. These conditions
amount to $\mathrm{Cov}[\epsilon_{it}, v_{it}] = 0$ and $\mathrm{Cov}[x_{it}, \alpha_{1i} \mid \alpha_{0i}] = 0$.
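
The IV estimand above can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the function name `pooled_fe_iv_slope`, the parameter values, and the simplification of setting the intercepts to zero and drawing an exogenous error are all illustrative assumptions. It pools a within-transformed IV regression with a common slope and compares the result with $\alpha_1 + \mathrm{Cov}[\alpha_{1i}, \pi_{1i}]/E[\pi_{1i}]$.

```python
import numpy as np


def pooled_fe_iv_slope(n=4000, T=20, seed=0):
    """Fixed-coefficient (pooled) within-IV slope vs. the weighted-mean estimand.

    Illustrative design (assumed, not from the paper): intercepts are zero,
    z, v and eps are independent standard normals, and (alpha_1i, pi_1i) are
    jointly normal with positive covariance.
    """
    rng = np.random.default_rng(seed)
    mean = np.array([1.0, 2.0])                      # E[alpha_1i], E[pi_1i]
    cov = np.array([[0.25, 0.15], [0.15, 0.25]])     # Cov[alpha_1i, pi_1i] = 0.15
    alpha1, pi1 = rng.multivariate_normal(mean, cov, size=n).T

    z = rng.normal(size=(n, T))
    v = rng.normal(size=(n, T))
    eps = rng.normal(size=(n, T))                    # exogenous error: Cov[eps, v] = 0
    x = pi1[:, None] * z + v                         # individual-specific first stage
    y = alpha1[:, None] * x + eps                    # individual-specific slope

    # Within transformation of the instrument removes individual intercepts.
    z_dm = z - z.mean(axis=1, keepdims=True)
    slope = (z_dm * y).sum() / (z_dm * x).sum()      # pooled IV with a common slope

    target = mean[0] + cov[0, 1] / mean[1]           # alpha_1 + Cov / E[pi_1i]
    return slope, target


if __name__ == "__main__":
    slope, target = pooled_fe_iv_slope()
    print(f"pooled FE-IV slope: {slope:.3f}, estimand formula: {target:.3f}")
```

In this design the pooled slope settles near the weighted mean rather than near $E[\alpha_{1i}] = 1$, because individuals with a stronger first stage also tend to have larger slopes.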

Our proposal is to estimate the mean coefficient from separate time series estimators 
for each individual. This strategy consists of running OLS or IV for each individual, and 
then estimating the population moment of interest by the corresponding sample moment 
of the individual estimators. For example, the mean of the random slope coefficient in the 
population is estimated by the sample average of the OLS or IV slopes. These sample 
moments converge to the population moments of interest as the number of individuals n and
time periods T grow. However, since a different coefficient is estimated for each individual, 
the asymptotic distribution of the sample moments can have asymptotic bias due to the 
incidental parameter problem (Neyman and Scott, 1948). 



3 The limit of the IV estimator is obtained from a first stage equation that also imposes fixed coefficients,
that is $x_{it} = \pi_{0i} + \pi_1 z_{it} + w_{it}$, where $w_{it} = z_{it}(\pi_{1i} - \pi_1) + v_{it}$. When the first stage equation is different for
each individual, the limit of the IV estimator is

\[ \alpha_1^{IV} = \alpha_1 + 2E[\pi_{1i}]\mathrm{Cov}[\alpha_{1i}, \pi_{1i}]/\{E[\pi_{1i}]^2 + \mathrm{Var}[\pi_{1i}]\}. \]

See Theorems 2 and 3 in Angrist and Imbens (1995) for a related discussion. 

4 This feature of the IV estimator is also pointed out in Angrist, Graddy and Imbens (1999), p. 507.

To illustrate the nature of this bias, consider the estimator of the mean coefficient $\alpha_1$
constructed from individual time series IV estimators. In this case the incidental parameter
problem is caused by the finite-sample bias of IV. This can be explained using some
expansions. Thus, assuming independence across t, standard higher-order asymptotics gives (e.g.,
Rilstone et al., 1996), as $T \to \infty$,

\[ \sqrt{T}(\hat\alpha_{1i}^{IV} - \alpha_{1i}) = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\psi_{it} + \frac{1}{\sqrt{T}}\beta_i + o_P(T^{-1/2}), \]

where $\psi_{it} = E[\tilde z_{it}\tilde x_{it} \mid \alpha_i, \pi_i]^{-1}\tilde z_{it}\epsilon_{it}$ is the influence function of IV, $\beta_i = -E[\tilde z_{it}\tilde x_{it} \mid \alpha_i, \pi_i]^{-2}
E[\tilde z_{it}^2\tilde x_{it}\epsilon_{it} \mid \alpha_i, \pi_i]$ is the higher-order bias of IV (see, e.g., Nagar, 1959, and Buse, 1992), and
the variables with tilde are in deviation from their individual means, e.g., $\tilde z_{it} = z_{it} - E[z_{it} \mid
\alpha_i, \pi_i]$. In the previous expression the first order asymptotic distribution of the individual
estimator is centered at the truth since $\sqrt{T}(\hat\alpha_{1i}^{IV} - \alpha_{1i}) \to_d N(0, \sigma_i^2)$ as $T \to \infty$, where
$\sigma_i^2 = E[\tilde z_{it}\tilde x_{it} \mid \alpha_i, \pi_i]^{-2}E[\tilde z_{it}^2\epsilon_{it}^2 \mid \alpha_i, \pi_i]$.

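Let $\hat\alpha_1 = n^{-1}\sum_{i=1}^{n}\hat\alpha_{1i}^{IV}$ be the sample average of the IV estimators. The asymptotic
distribution of $\hat\alpha_1$ is not centered around $\alpha_1$ in short panels or, more precisely, under asymptotic
sequences where $T/\sqrt{n} \to 0$. To see this, consider the expansion for $\hat\alpha_1$

\[ \sqrt{n}(\hat\alpha_1 - \alpha_1) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\alpha_{1i} - \alpha_1) + \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\hat\alpha_{1i}^{IV} - \alpha_{1i}). \]
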

The first term is the standard influence function for a sample mean of known elements. The 
second term comes from the estimation of the individual elements inside the sample mean. 
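Assuming independence across i and combining the previous expansions,

\[ \sqrt{n}(\hat\alpha_1 - \alpha_1) = \underbrace{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\alpha_{1i} - \alpha_1)}_{=O_P(1)} + \underbrace{\frac{1}{\sqrt{T}}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{T}\psi_{it}}_{=O_P(1/\sqrt{T})} + \underbrace{\frac{\sqrt{n}}{T}\frac{1}{n}\sum_{i=1}^{n}\beta_i}_{=O(\sqrt{n}/T)} + o_P(1). \]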

This expression shows that the bias term dominates the asymptotic distribution of $\hat\alpha_1$ in
short panels under sequences where $T/\sqrt{n} \to 0$. Averaging reduces the order of the variance
of $\hat\alpha_{1i}^{IV}$, without affecting the order of its bias. In this case the estimation of the individual
coefficients has no first order effect in the asymptotic variance of $\hat\alpha_1$ because the second term
is of smaller order than the first term.
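
The expansion can be illustrated with a small Monte Carlo sketch. Everything here is an illustrative assumption rather than the paper's design: the function name `mean_of_individual_iv`, a common first-stage slope `pi1`, standard normal instruments, and an error correlation `rho` between $\epsilon_{it}$ and $v_{it}$. Under these assumptions the leading bias term is $\beta_i/T = -\rho/(\pi_1^2 T)$; the simulated bias is somewhat larger in magnitude for small T because of higher-order terms.

```python
import numpy as np


def mean_of_individual_iv(n, T, rho=0.8, pi1=2.0, seed=0):
    """Bias of the cross-sectional average of individual time series IV slopes.

    Assumed design: x = pi1 * z + v, y = alpha_1i * x + eps, Cov[eps, v] = rho,
    Var[z] = Var[v] = Var[eps] = 1. Returns (simulated bias, leading term).
    """
    rng = np.random.default_rng(seed)
    alpha1 = rng.normal(1.0, 0.5, size=n)            # heterogeneous slopes
    z = rng.normal(size=(n, T))
    v = rng.normal(size=(n, T))
    eps = rho * v + np.sqrt(1.0 - rho**2) * rng.normal(size=(n, T))
    x = pi1 * z + v                                  # endogenous regressor
    y = alpha1[:, None] * x + eps

    # IV separately for each individual, instrument demeaned within individual.
    z_dm = z - z.mean(axis=1, keepdims=True)
    iv_i = (z_dm * y).sum(axis=1) / (z_dm * x).sum(axis=1)

    bias = iv_i.mean() - alpha1.mean()               # averaging does not remove it
    leading_term = -rho / pi1**2 / T                 # beta_i / T in this design
    return bias, leading_term


if __name__ == "__main__":
    for T in (20, 80):
        bias, lead = mean_of_individual_iv(n=40000, T=T)
        print(f"T={T}: simulated bias {bias:.4f}, leading term {lead:.4f}")
```

Increasing n leaves the bias essentially unchanged, while increasing T shrinks it at rate 1/T, which is the pattern the expansion describes.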

A potential drawback of the individual by individual time series estimation is that it might
be more sensitive to weak identification problems than fixed coefficient pooled estimation.^5
In the random coefficient model, for example, we require that $E[z_{it}x_{it} \mid \pi_i] = \pi_{1i} \neq 0$ with
probability one, i.e., for all the individuals, whereas fixed coefficient IV only requires that this
condition holds on average, i.e., $E[\pi_{1i}] \neq 0$. The individual estimators are therefore more
sensitive than traditional pooled estimators to weak instruments problems. On the other
hand, individual by individual estimation relaxes the exogeneity condition by conditioning on
additive and non-additive time invariant heterogeneity, i.e., $E[z_{it}\epsilon_{it} \mid \alpha_i, \pi_i] = 0$. Traditional
fixed effects estimators only condition on additive time invariant heterogeneity. A formal
treatment of these identification issues is beyond the scope of this paper.

5 We thank a referee for pointing out this issue.



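2.2. Variance of individual coefficients. Consider the panel model:

\[ y_{it} = \alpha_i + \epsilon_{it}, \quad \epsilon_{it} \mid \alpha_i \sim (0, \sigma_\epsilon^2), \quad \alpha_i \sim (\alpha, \sigma_\alpha^2), \quad (t = 1, \ldots, T;\ i = 1, \ldots, n); \]

where $y_{it}$ is an outcome variable of interest, which can be decomposed in an individual effect
$\alpha_i$ with mean $\alpha$ and variance $\sigma_\alpha^2$, and an error term $\epsilon_{it}$ with zero mean and variance $\sigma_\epsilon^2$
conditional on $\alpha_i$. The parameter of interest is $\sigma_\alpha^2 = \mathrm{Var}[\alpha_i]$ and its fixed effects estimator
is

\[ \hat\sigma_\alpha^2 = (n-1)^{-1}\sum_{i=1}^{n}(\hat\alpha_i - \hat\alpha)^2, \]

where $\hat\alpha_i = T^{-1}\sum_{t=1}^{T} y_{it}$ and $\hat\alpha = n^{-1}\sum_{i=1}^{n}\hat\alpha_i$.

Let $\varphi_{\alpha_i} = (\alpha_i - \alpha)^2 - \sigma_\alpha^2$ and $\psi_{\epsilon_{it}} = \epsilon_{it}^2 - \sigma_\epsilon^2$. Assuming independence across i and t, a
standard asymptotic expansion gives, as $n, T \to \infty$,

\[ \sqrt{n}(\hat\sigma_\alpha^2 - \sigma_\alpha^2) = \underbrace{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\varphi_{\alpha_i}}_{=O_P(1)} + \underbrace{\frac{1}{\sqrt{T}}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{T}\psi_{\epsilon_{it}}}_{=O_P(1/\sqrt{T})} + \underbrace{\frac{\sqrt{n}}{T}\sigma_\epsilon^2}_{=O(\sqrt{n}/T)} + o_P(1). \]

The first term corresponds to the influence function of the sample variance if the $\alpha_i$'s were
known. The second term comes from the estimation of the $\alpha_i$'s. The third term is a bias
term that comes from the nonlinearity of the variance in $\hat\alpha_i$. The bias term dominates the
expansion in short panels under sequences where $T/\sqrt{n} \to 0$. As in the previous example,
the estimation of the $\alpha_i$'s has no first order effect in the asymptotic variance since the second
term is of smaller order than the first term.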
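
The size of this bias is easy to see in a simulation. The sketch below is illustrative only: the function name `variance_of_effects`, the normal distributions, and the parameter values are assumptions, and the plug-in correction (subtracting an estimate of $\sigma_\epsilon^2/T$ built from within-individual residuals) is a simple textbook device shown for intuition, not necessarily the correction developed later in the paper.

```python
import numpy as np


def variance_of_effects(n=5000, T=5, sigma_alpha=1.0, sigma_eps=2.0, seed=0):
    """Naive and bias-corrected estimates of Var[alpha_i] in y_it = alpha_i + eps_it.

    Assumed design: alpha_i ~ N(0, sigma_alpha^2), eps_it ~ N(0, sigma_eps^2).
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, sigma_alpha, size=n)
    y = alpha[:, None] + rng.normal(0.0, sigma_eps, size=(n, T))

    alpha_hat = y.mean(axis=1)                       # individual time series means
    naive = alpha_hat.var(ddof=1)                    # (n-1)^{-1} sum (a_i - a_bar)^2

    # Within-individual residual variance estimates sigma_eps^2.
    s2_eps = ((y - alpha_hat[:, None]) ** 2).sum() / (n * (T - 1))
    corrected = naive - s2_eps / T                   # remove the sigma_eps^2 / T term
    return naive, corrected


if __name__ == "__main__":
    naive, corrected = variance_of_effects()
    print(f"naive: {naive:.3f} (expected about 1 + 4/5), corrected: {corrected:.3f}")
```

With $\sigma_\alpha^2 = 1$, $\sigma_\epsilon^2 = 4$ and $T = 5$, the naive estimator centers near 1.8 however large n is, while the corrected one centers near 1.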



3. The Model and Estimators 

We consider a general model with a finite number of moment conditions $d_g$. To describe it,
let the data be denoted by $z_{it}$ $(i = 1, \ldots, n;\ t = 1, \ldots, T)$. We assume that $z_{it}$ is independent
over i and stationary and strongly mixing over t. Also, let $\theta$ be a $d_\theta$-vector of common
parameters, $\{\alpha_i : 1 \le i \le n\}$ be a sequence of $d_\alpha$-vectors with the realizations of the
individual effects, and $g(z; \theta, \alpha_i)$ be a $d_g$-vector of functions, where $d_g \ge d_\theta + d_\alpha$.^6 The
model has true parameters $\theta_0$ and $\{\alpha_{i0} : 1 \le i \le n\}$, satisfying the moment conditions

\[ \bar{E}[g(z_{it}; \theta_0, \alpha_{i0})] = 0, \quad (t = 1, \ldots, T;\ i = 1, \ldots, n), \]

where $\bar{E}[\cdot]$ denotes conditional expectation with respect to the distribution of $z_{it}$ conditional
on the individual effects.

Let $E[\cdot]$ denote the expectation taken with respect to the distribution of the individual
effects. In the previous model, the ultimate quantities of interest are smooth functions of
parameters and observations, which in some cases could be the parameters themselves,

\[ \zeta = E\bar{E}[\zeta_i(z_{it}; \theta_0, \alpha_{i0})], \]

if $E\bar{E}|\zeta_i(z_{it}; \theta_0, \alpha_{i0})| < \infty$, or moments or other smooth functions of the individual effects

\[ \mu = E[\mu(\alpha_{i0})], \]

if $E|\mu(\alpha_{i0})| < \infty$. In the correlated random coefficient example, $g(z_{it}; \theta_0, \alpha_{i0}) = z_{it}(y_{it} - \alpha_{0i0} -
\alpha_{1i0}x_{it})$, $\theta = \emptyset$, $d_\theta = 0$, $d_\alpha = 2$, and $\mu(\alpha_{i0}) = \alpha_{1i0}$. In the variance of the random coefficients
example, $g(z_{it}; \theta_0, \alpha_{i0}) = (y_{it} - \alpha_{i0})$, $\theta = \emptyset$, $d_\theta = 0$, $d_\alpha = 1$, and $\mu(\alpha_{i0}) = (\alpha_{i0} - E[\alpha_{i0}])^2$.

Some more notation, which will be extensively used in the definition of the estimators and
in the analysis of their asymptotic properties, is the following

\[ \Omega_{ji}(\theta, \alpha_i) := \bar{E}[g(z_{it}; \theta, \alpha_i)g(z_{i,t-j}; \theta, \alpha_i)'], \quad j \in \{0, 1, 2, \ldots\}, \]
\[ G_{\theta i}(\theta, \alpha_i) := \bar{E}[G_\theta(z_{it}; \theta, \alpha_i)] = \bar{E}[\partial g(z_{it}; \theta, \alpha_i)/\partial\theta'], \]
\[ G_{\alpha i}(\theta, \alpha_i) := \bar{E}[G_\alpha(z_{it}; \theta, \alpha_i)] = \bar{E}[\partial g(z_{it}; \theta, \alpha_i)/\partial\alpha_i'], \]

where superscript $'$ denotes transpose and higher-order derivatives will be denoted by adding
subscripts. Here $\Omega_{ji}$ is the covariance matrix between the moment conditions for individual
i at times t and t − j, and $G_{\theta i}$ and $G_{\alpha i}$ are time series average derivatives of these conditions.
Analogously, for sample moments

\[ \hat\Omega_{ji}(\theta, \alpha_i) := T^{-1}\sum_{t=j+1}^{T} g(z_{it}; \theta, \alpha_i)g(z_{i,t-j}; \theta, \alpha_i)', \quad j \in \{0, 1, \ldots, T-1\}, \]
\[ \hat G_{\theta i}(\theta, \alpha_i) := T^{-1}\sum_{t=1}^{T} G_\theta(z_{it}; \theta, \alpha_i) = T^{-1}\sum_{t=1}^{T}\partial g(z_{it}; \theta, \alpha_i)/\partial\theta', \]
\[ \hat G_{\alpha i}(\theta, \alpha_i) := T^{-1}\sum_{t=1}^{T} G_\alpha(z_{it}; \theta, \alpha_i) = T^{-1}\sum_{t=1}^{T}\partial g(z_{it}; \theta, \alpha_i)/\partial\alpha_i'. \]

In the sequel, the arguments of the expressions will be omitted when the functions are
evaluated at the true parameter values $(\theta_0', \alpha_{i0}')'$, e.g., $g(z_{it})$ means $g(z_{it}; \theta_0, \alpha_{i0})$.

6 We impose that some of the parameters are common for all the individuals to help preserve degrees of
freedom in estimation of short panels with many regressors. An order condition for this model is that the
number of individual specific parameters $d_\alpha$ has to be less than the time dimension T.

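In cross-section and time series models, parameters defined from moment conditions are
usually estimated using the two-step GMM estimator of Hansen (1982). To describe how
to adapt this method to panel models with fixed effects, let $\hat g_i(\theta, \alpha_i) := T^{-1}\sum_{t=1}^{T} g(z_{it}; \theta, \alpha_i)$,
and let $(\tilde\theta', \{\tilde\alpha_i'\}_{i=1}^{n})'$ be some preliminary one-step FE-GMM estimator, given by

\[ (\tilde\theta', \{\tilde\alpha_i'\}_{i=1}^{n})' = \arg\inf_{\{(\theta', \alpha_i')' \in \Upsilon\}_{i=1}^{n}} \sum_{i=1}^{n}\hat g_i(\theta, \alpha_i)'\hat W_i^{-1}\hat g_i(\theta, \alpha_i), \]

where $\Upsilon \subset \mathbb{R}^{d_\theta + d_\alpha}$ denotes the parameter
space, and $\{\hat W_i : 1 \le i \le n\}$ is a sequence of positive definite symmetric $d_g \times d_g$ weighting
matrices. The two-step FE-GMM estimator is the solution to the following program

\[ (\hat\theta', \{\hat\alpha_i'\}_{i=1}^{n})' = \arg\inf_{\{(\theta', \alpha_i')' \in \Upsilon\}_{i=1}^{n}} \sum_{i=1}^{n}\hat g_i(\theta, \alpha_i)'\hat\Omega_i(\tilde\theta, \tilde\alpha_i)^{-1}\hat g_i(\theta, \alpha_i), \]

where $\hat\Omega_i(\tilde\theta, \tilde\alpha_i)$ is an estimator of the optimal weighting matrix for individual i

\[ \Omega_i = \Omega_{0i} + \sum_{j=1}^{\infty}(\Omega_{ji} + \Omega_{ji}'). \]

To facilitate the asymptotic analysis, in the estimation of the optimal weighting matrix we
assume that $g(z_{it}; \theta_0, \alpha_{i0})$ is a martingale difference sequence with respect to the sigma algebra
$\sigma(\alpha_{i0}, z_{i,t-1}, z_{i,t-2}, \ldots)$, so that $\Omega_i = \Omega_{0i}$ and $\hat\Omega_i(\tilde\theta, \tilde\alpha_i) = \hat\Omega_{0i}(\tilde\theta, \tilde\alpha_i)$. This assumption holds
in rational expectation models. We do not impose this assumption to derive the limiting
distribution of the one-step FE-GMM estimator.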
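
For a single individual with linear moment conditions and no common parameters, the one-step and two-step programs have closed forms. The sketch below is a minimal illustration under those assumptions, not the general FE-GMM estimator: the function name `two_step_gmm_linear`, the choice of the one-step weighting matrix $\hat W_i = Z'Z/T$, and the use of $\hat\Omega_{0i}$ alone (the martingale difference case) are illustrative choices. It takes $g(z_{it}; \alpha_i) = z_{it}(y_{it} - x_{it}'\alpha_i)$ and minimizes $\hat g_i'\hat W_i^{-1}\hat g_i$ in each step.

```python
import numpy as np


def two_step_gmm_linear(y, X, Z):
    """One-step and two-step GMM for one individual's time series.

    Moment function (assumed linear): g_t(a) = Z_t * (y_t - X_t' a).
    y: (T,), X: (T, d_alpha), Z: (T, d_g) with d_g >= d_alpha.
    Returns (one_step_estimate, two_step_estimate).
    """
    T = len(y)

    def minimize_quadratic_form(W):
        # Minimizer of g_hat(a)' W^{-1} g_hat(a) with g_hat(a) = Z'(y - X a) / T.
        ZX, Zy = Z.T @ X / T, Z.T @ y / T
        W_inv = np.linalg.inv(W)
        return np.linalg.solve(ZX.T @ W_inv @ ZX, ZX.T @ W_inv @ Zy)

    # Step 1: preliminary estimator with W_hat = Z'Z / T (equivalent to 2SLS).
    a_tilde = minimize_quadratic_form(Z.T @ Z / T)

    # Step 2: reweight with Omega_hat_0 evaluated at the preliminary estimate.
    g = Z * (y - X @ a_tilde)[:, None]               # (T, d_g) moment contributions
    omega_hat = g.T @ g / T
    a_hat = minimize_quadratic_form(omega_hat)
    return a_tilde, a_hat


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    T = 2000
    z = rng.normal(size=(T, 2))
    v = rng.normal(size=T)
    eps = 0.5 * v + rng.normal(size=T)               # endogeneity through v
    x = z[:, 0] + z[:, 1] + v
    y = 1.0 + 2.0 * x + eps
    X = np.column_stack([np.ones(T), x])
    Z = np.column_stack([np.ones(T), z])             # overidentified: 3 moments, 2 parameters
    print(two_step_gmm_linear(y, X, Z))
```

In the panel setting this routine would be applied to each individual's time series separately when $d_\theta = 0$; with common parameters the minimization is joint over $\theta$ and all $\alpha_i$, which this sketch does not cover.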

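For the subsequent analysis of the asymptotic properties of the estimator, it is convenient
to consider the concentrated or profile problem. This problem is a two-step procedure. In
the first step the program is solved for the individual effects, given the value of the common
parameter $\theta$. The First Order Conditions (FOC) for this stage, reparametrized conveniently
as in Newey and Smith (2004), are the following

\[ \hat t_i(\theta, \hat\gamma_i(\theta)) = -\begin{pmatrix} \hat G_{\alpha i}(\theta, \hat\alpha_i(\theta))'\hat\lambda_i(\theta) \\ \hat g_i(\theta, \hat\alpha_i(\theta)) + \hat\Omega_i(\tilde\theta, \tilde\alpha_i)\hat\lambda_i(\theta) \end{pmatrix} = 0, \quad (i = 1, \ldots, n), \]

where $\lambda_i$ is a $d_g$-vector of individual Lagrange multipliers for the moment conditions, and
$\gamma_i := (\alpha_i', \lambda_i')'$ is an extended $(d_\alpha + d_g)$-vector of individual effects. Then, the solutions to
the previous equations are plugged into the original problem, leading to the following first
order conditions for $\theta$, $\hat s_n(\hat\theta) = 0$, where

\[ \hat s_n(\theta) = n^{-1}\sum_{i=1}^{n}\hat s_i(\theta, \hat\gamma_i(\theta)) = -n^{-1}\sum_{i=1}^{n}\hat G_{\theta i}(\theta, \hat\alpha_i(\theta))'\hat\lambda_i(\theta) \]

is the profile score function for $\theta$.^7

Fixed effects estimators of smooth functions of parameters and observations are
constructed using the plug-in principle, i.e., $\hat\zeta = \hat\zeta(\hat\theta)$, where

\[ \hat\zeta(\theta) = (nT)^{-1}\sum_{i=1}^{n}\sum_{t=1}^{T}\zeta_i(z_{it}; \theta, \hat\alpha_i(\theta)). \]

Similarly, moments of the individual effects are estimated by $\hat\mu = \hat\mu(\hat\theta)$, where

\[ \hat\mu(\theta) = n^{-1}\sum_{i=1}^{n}\mu(\hat\alpha_i(\theta)). \]
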

4. Asymptotic Theory for FE-GMM Estimators 

In this section we analyze the properties of one-step and two-step FE-GMM estimators in 
large samples. We show consistency and derive the asymptotic distributions for estimators 
of individual effects, common parameters and other quantities of interest under sequences 
where both n and T pass to infinity with the sample size. We establish results separately 
for one-step and two-step estimators because the former are derived under less restrictive 
assumptions. 

We make the following assumptions to show uniform consistency of the FE-GMM one-step 
estimator: 

Condition 1 (Sampling and asymptotics). (i) For each i, $z_i := \{z_{it} : 1 \le t \le T\}$ is a stationary
mixing sequence of random vectors with strong mixing coefficients $a_i(l) = \sup_t \sup_{A \in \mathcal{A}_t^i, D \in \mathcal{D}_{t+l}^i}
|P(A \cap D) - P(A)P(D)|$, where $\mathcal{A}_t^i = \sigma(\alpha_{i0}, z_{it}, z_{i,t-1}, \ldots)$ and $\mathcal{D}_t^i = \sigma(\alpha_{i0}, z_{it}, z_{i,t+1}, \ldots)$, such
that $\sup_i |a_i(l)| \le Ca^l$ for some $0 < a < 1$ and some $C > 0$; (ii) $\{z_i : 1 \le i \le n\}$ are
independent across i; (iii) $n, T \to \infty$ such that $n/T \to \kappa^2$, where $0 < \kappa^2 < \infty$; and (iv)
$\dim[g(\cdot; \theta, \alpha_i)] = d_g < \infty$.

For a matrix or vector A, let $|A|$ denote the Euclidean norm, that is $|A|^2 = \mathrm{trace}[AA']$.

Condition 2 (Regularity and identification). (i) The vector of moment functions $g(\cdot; \theta, \alpha) =
(g_1(\cdot; \theta, \alpha), \ldots, g_{d_g}(\cdot; \theta, \alpha))'$ is continuous in $(\theta, \alpha) \in \Upsilon$; (ii) the parameter space $\Upsilon$ is a
compact, convex subset of $\mathbb{R}^{d_\theta + d_\alpha}$; (iii) $\dim(\theta, \alpha) = d_\theta + d_\alpha \le d_g$; (iv) there exists a
function $M(z_{it})$ such that $|g_k(z_{it}; \theta, \alpha_i)| \le M(z_{it})$, $|\partial g_k(z_{it}; \theta, \alpha_i)/\partial(\theta, \alpha_i)| \le M(z_{it})$, for
$k = 1, \ldots, d_g$, and $\sup_i \bar{E}[M(z_{it})^{4+\delta}] < \infty$ for some $\delta > 0$; and (v) there exists a
deterministic sequence of symmetric finite positive definite matrices $\{W_i : 1 \le i \le n\}$ such that
$\sup_{1 \le i \le n}|\hat W_i - W_i| \to_P 0$, and, for each $\eta > 0$,

\[ \inf_i\left[Q_i^W(\theta_0, \alpha_{i0}) - \sup_{\{(\theta, \alpha) : |(\theta, \alpha) - (\theta_0, \alpha_{i0})| > \eta\}} Q_i^W(\theta, \alpha)\right] > 0, \]

where

\[ Q_i^W(\theta, \alpha_i) := -g_i(\theta, \alpha_i)'W_i^{-1}g_i(\theta, \alpha_i), \quad g_i(\theta, \alpha_i) := \bar{E}[\hat g_i(\theta, \alpha_i)]. \]

7 In the original parametrization, the FOC can be written as

\[ n^{-1}\sum_{i=1}^{n}\hat G_{\theta i}(\hat\theta, \hat\alpha_i(\hat\theta))'\hat\Omega_i(\tilde\theta, \tilde\alpha_i)^{-}\hat g_i(\hat\theta, \hat\alpha_i(\hat\theta)) = 0, \]

where the superscript $-$ denotes a generalized inverse.



Conditions 1(i)-(ii) impose cross sectional independence, but allow for weak time series
dependence as in Hahn and Kuersteiner (2011). Conditions 1(iii)-(iv) describe the asymptotic
sequences that we consider where T and n grow at the same rate with the sample size, whereas
the number of moments $d_g$ is fixed. Condition 2 adapts standard assumptions of the GMM
literature to guarantee the identification of the parameters based on time series variation for
all the individuals; see Newey and McFadden (1994). The dominance and moment conditions
in Condition 2(iv) are used to establish uniform consistency of the estimators of the individual effects.

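Theorem 1 (Uniform consistency of one-step estimators). Suppose that Conditions 1 and
2 hold. Then, for any $\eta > 0$,

\[ \Pr(|\tilde\theta - \theta_0| \ge \eta) = o(T^{-1}), \]

where $\tilde\theta = \arg\max_{\{(\theta, \alpha_i) \in \Upsilon\}_{i=1}^{n}}\sum_{i=1}^{n}\hat Q_i^W(\theta, \alpha_i)$ and $\hat Q_i^W(\theta, \alpha_i) := -\hat g_i(\theta, \alpha_i)'\hat W_i^{-1}\hat g_i(\theta, \alpha_i)$.
Also, for any $\eta > 0$,

\[ \Pr\left(\sup_{1 \le i \le n}|\tilde\alpha_i - \alpha_{i0}| \ge \eta\right) = o(T^{-1}), \]

where $\tilde\alpha_i = \arg\max_\alpha \hat Q_i^W(\tilde\theta, \alpha)$, and $\Pr\left(\sup_{1 \le i \le n}|\tilde\lambda_i| \ge \eta\right) = o(T^{-1})$, where $\tilde\lambda_i = -\hat W_i^{-1}\hat g_i(\tilde\theta, \tilde\alpha_i)$.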

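Let $\Sigma_{\alpha i}^W := (G_{\alpha i}'W_i^{-1}G_{\alpha i})^{-1}$, $H_{\alpha i}^W := \Sigma_{\alpha i}^W G_{\alpha i}'W_i^{-1}$, $P_{\alpha i}^W := W_i^{-1} - W_i^{-1}G_{\alpha i}H_{\alpha i}^W$, $J_{si}^W :=
G_{\theta i}'P_{\alpha i}^W G_{\theta i}$ and $J_s^W := E[J_{si}^W]$. We use the following additional assumptions to derive the
limiting distribution of the one-step estimator:

Condition 3 (Regularity). (i) For each i, $(\theta_0, \alpha_{i0}) \in \mathrm{int}[\Upsilon]$; and (ii) $J_s^W$ is finite positive
definite, and $\{G_{\alpha i}'W_i^{-1}G_{\alpha i} : 1 \le i \le n\}$ is a sequence of finite positive definite matrices,
where $\{W_i : 1 \le i \le n\}$ is the sequence of matrices of Condition 2(v).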

Condition 4 (Smoothness). (i) There exists a function $M(z_{it})$ such that, for $k = 1, \ldots, d_g$,
$|\partial^{d_1 + d_2}g_k(z_{it}; \theta, \alpha_i)/\partial\theta^{d_1}\partial\alpha_i^{d_2}| \le M(z_{it})$, $0 \le d_1 + d_2 \le 1, \ldots, 5$,
and $\sup_i \bar{E}[M(z_{it})^{5(d_\theta + d_\alpha + 6)/(1 - 10v) + \delta}] < \infty$, for some $\delta > 0$ and $0 < v < 1/10$; and (ii)
there exists $\xi_i(z_{it})$ such that $\hat W_i = W_i + \sum_{t=1}^{T}\xi_i(z_{it})/T + R_i^W/T$, where $\max_i|R_i^W| = o_P(T^{1/2})$,
$\bar{E}[\xi_i(z_{it})] = 0$, and $\sup_i \bar{E}[|\xi_i(z_{it})|^{20/(1 - 10v) + \delta}] < \infty$, for some $\delta > 0$ and $0 < v < 1/10$.

Condition 3 is the panel data analog to the standard asymptotic normality condition for
GMM with cross sectional data; see Newey and McFadden (1994). Condition 4 is similar to
Condition 4 in Hahn and Kuersteiner (2011), and guarantees the existence of higher order
expansions for the GMM estimators and the uniform convergence of their remainder terms.

Let $G_{\alpha\alpha i} := (G_{\alpha\alpha i,1}', \ldots, G_{\alpha\alpha i,d_\alpha}')'$, where $G_{\alpha\alpha i,j} = \bar{E}[\partial G_\alpha(z_{it})/\partial\alpha_{i,j}]$, and $G_{\theta\alpha i} := (G_{\theta\alpha i,1}', \ldots, G_{\theta\alpha i,d_\alpha}')'$,
where $G_{\theta\alpha i,j} = \bar{E}[\partial G_\theta(z_{it})/\partial\alpha_{i,j}]$. The symbol $\otimes$ denotes Kronecker product of matrices, $I_{d_\alpha}$
a $d_\alpha \times d_\alpha$ identity matrix, $e_j$ a unitary $d_g$-vector with 1 in row j, and $P_{\alpha i,j}^W$ the j-th column
of $P_{\alpha i}^W$. Recall that the extended individual effect is $\gamma_i = (\alpha_i', \lambda_i')'$.

Lemma 1 (Asymptotic expansion for one-step estimators of individual effects). Under Conditions 1, 2, 3, and 4,

(4.1) Vf(^o - 7,0) = + T-^Ql + T^Rl, 

where 7; := 7i(#o), 



n- 1/2 El^ 4 N(0,E[Vy]), rr 1 Qu * E[B%], B% = B^ + B^ + B^, 

S ^Pl<i<n R Yi = Op(VT), for 



V; 



W _ [ n a t 1 q I ttW pW 

pw ll% V ' °" 



(p>W.I \ ( ttW \ / 00 d = 

B f,i ) = [ p w J ( E EiGMHZo^t-j)] ] , 

<' G - ( B f,a ) = ( F p ] E ^[G Qi (^)'P^^ t -i)], 



j = -oo 



< 15 = (*I£) = (j^JIE^ 



pW 



J2 EfoWP^gizis-j) 



3- 



Theorem 2 (Limit distribution of one-step estimators of common parameters). Under Conditions 1, 2, 3, and 4,

$$\sqrt{nT}(\tilde\theta - \theta_0) \to_d -(J_s^W)^{-1} N\left(\kappa B_s^W, V_s^W\right),$$
where

Jf = E [G' e PZG e ] ,V S W = E [G'P^P^g^ , B w = E [ B W + B w,c + B wy 
and 

B^ B = ~G' 6i + Bl G + B^ S ) , BY = Er=-oo ^W'PjftM]. 

The expressions for $B^{W,I}_{\lambda_i}$, $B^{W,G}_{\lambda_i}$, and $B^{W,1S}_{\lambda_i}$ are given in Lemma 1.

The source of the bias is the non-zero expectation of the profile score of $\theta$ at the true
parameter value, due to the substitution of the unobserved individual effects by sample
estimators. These estimators converge to their true parameter value at a rate $\sqrt{T}$, which is
slower than $\sqrt{nT}$, the rate of convergence of the estimator of the common parameter, under
asymptotic sequences where $n$ and $T$ grow with the sample size. Intuitively, only the
observations for each individual convey information about the corresponding individual effect. In
nonlinear and dynamic models, the slow convergence of the estimator of the individual effect
introduces bias in the estimators of the remaining parameters. The expression of this bias can
be explained with an expansion of the score around the true value of the individual effect

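$$E[\hat s_i^W(\theta_0,\tilde\gamma_{i0})] = E[\hat s_i^W] + E[\hat s^W_{\gamma i}]' E[\tilde\gamma_{i0} - \gamma_{i0}] + E[(\hat s^W_{\gamma i} - E[\hat s^W_{\gamma i}])'(\tilde\gamma_{i0} - \gamma_{i0})]$$
$$\qquad + E\Big[\sum_{j=1}^{d_\alpha + d_g} (\tilde\gamma_{i0,j} - \gamma_{i0,j}) E[\hat s^W_{\gamma\gamma i,j}] (\tilde\gamma_{i0} - \gamma_{i0})\Big]/2 + o(T^{-1})$$
$$= 0 + B^{W,B}_{si}/T + B^{W,C}_{si}/T + B^{W,V}_{si}/T + o(T^{-1}).$$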

This expression shows that the bias has the same three components as in the MLE case, see
Hahn and Newey (2004). The first component, $B^{W,B}_{si}$, comes from the higher-order bias of the
estimator of the individual effects. The second component, $B^{W,C}_{si}$, is a correlation term and
is present because individual effects and common parameters are estimated using the same
observations. The third component, $B^{W,V}_{si}$, is a variance term. The bias of the individual
effects, $B^{W,B}_{si}$, can be further decomposed in three terms corresponding to the asymptotic
bias for a GMM estimator with the optimal score, $B^{W,I}_{\lambda_i}$, when $W$ is used as the weighting
function; the bias arising from estimation of $G_{\alpha_i}$, $B^{W,G}_{\lambda_i}$; and the bias arising from not using
an optimal weighting matrix, $B^{W,1S}_{\lambda_i}$.

8 Using the notation introduced in Section 3, the score is

$$\hat s^W(\theta_0) = n^{-1}\sum_{i=1}^n \hat s_i^W(\theta_0, \tilde\gamma_{i0}) = -n^{-1}\sum_{i=1}^n \hat G_{\theta_i}(\theta_0, \tilde\alpha_{i0})'\tilde\lambda_{i0},$$

where $\tilde\gamma_{i0} = (\tilde\alpha_{i0}', \tilde\lambda_{i0}')'$ is the solution to

$$\hat t_i^W(\theta_0, \tilde\gamma_{i0}) = -\begin{pmatrix} \hat G_{\alpha_i}(\theta_0,\tilde\alpha_{i0})'\tilde\lambda_{i0} \\ \hat g_i(\theta_0,\tilde\alpha_{i0}) + \hat W_i\tilde\lambda_{i0} \end{pmatrix} = 0.$$

We use the following condition to show the consistency of the two-step FE-GMM estimator: 

Condition 5 (Smoothness, regularity, and martingale), (i) There exists a function M (zu) 
such that \g k (z it ;8,ai)\ < M(z it ), \dg k (z it ; 9,ati) /d (9,ati)\ < M(z it ), fork = l,...,d g , 



and supj E 



Mizit 



AQ(d e +d a +&)/(l-lQv)+S 



< oo, for some 5 > and < v < 1/10; (ii) 
{Qi : 1 < i < n} is a sequence of finite positive definite matrices; and (Hi) for each i, 
g(zi t ; 9q, ojjo) is a martingale difference sequence with respect to cr(ajo, Zi,t-i> z i,t-2, • • •)• 

Conditions 5(i)-(ii) are used to establish the uniform consistency of the estimators of the
individual weighting matrices. Condition 5(iii) is convenient to simplify the expressions of
the optimal weighting matrices. It holds, for example, in rational expectation models that
commonly arise in economic applications.

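Theorem 3 (Uniform consistency of two-step estimators). Suppose that Conditions 1, 2,
and 5 hold. Then, for any $\eta > 0$,

$$\Pr\left(|\hat\theta - \theta_0| \ge \eta\right) = o(T^{-1}),$$

where $\hat\theta = \arg\max_{\{(\theta,\alpha_i)\}_{i=1}^n \in \Upsilon} \sum_{i=1}^n \hat Q_i^\Omega(\theta,\alpha_i)$ and $\hat Q_i^\Omega(\theta,\alpha_i) := -\hat g_i(\theta,\alpha_i)' \hat\Omega_i(\tilde\theta,\tilde\alpha_i)^{-1} \hat g_i(\theta,\alpha_i)$.
Also, for any $\eta > 0$,

$$\Pr\Big(\sup_{1\le i\le n} |\hat\alpha_i - \alpha_{i0}| \ge \eta\Big) = o(T^{-1}),$$

where $\hat\alpha_i = \arg\max_{\alpha} \hat Q_i^\Omega(\hat\theta,\alpha)$, and $\Pr\big(\sup_{1\le i\le n} |\hat\lambda_i| \ge \eta\big) = o(T^{-1})$, where $\hat g_i(\hat\theta,\hat\alpha_i) + \hat\Omega_i(\tilde\theta,\tilde\alpha_i)\hat\lambda_i = 0$.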



We replace Condition 4 by the following condition to obtain the limit distribution of the
two-step estimator:

Condition 6 (Smoothness). There exists some M (zu) such that, for k = 1, d g 
\d dl+d2 g k (z it ; 9, ai ) /d9 dl da d2 \ < M (z it ) < d x + d 2 < 1, . . . , 5, 



and supj E 



M(z it 



W(d e +d a +6)/(l-10v)+S 



< oo, for some 5 > and < v < 1/10. 



Condition 6 guarantees the existence of higher order expansions for the estimators of the
weighting matrices and uniform convergence of their remainder terms. Conditions 5 and 6
are stronger versions of Conditions 2(iv), 2(v) and 4. They are presented separately because
they are only needed when there is a first stage where the weighting matrices are estimated.

Let $\Sigma_{\alpha_i} := (G_{\alpha_i}' \Omega_i^{-1} G_{\alpha_i})^{-1}$, $H_{\alpha_i} := \Sigma_{\alpha_i} G_{\alpha_i}' \Omega_i^{-1}$, and $P_{\alpha_i} := \Omega_i^{-1} - \Omega_i^{-1} G_{\alpha_i} H_{\alpha_i}$.

Lemma 2 (Asymptotic expansion for two-step estimators of individual effects). Under the
Conditions 1, 2, 3, 5, and 6,

(4.2) Vf{% - = fa + T-^B^ + T~ 1 R 2 i, 

where % := %(6 ), 



t=i 



n- 1/2 Er=i & ^ ^(0. B 7i = Bl H + + B% + su Pl <,< n R 2l = o P (VT), with, for 

Vi = diag(T, ai ,P ai ) , 



B* t = i b{ ) = [ H p ai ) \^JlG aa ^ a j2 + E[G a Mt)H ai g{z l ^ ] 



< - (<M£:)|m<-< 

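Theorem 4 (Limit distribution for two-step estimators of common parameters). Under the
Conditions 1, 2, 3, 5, and 6,

$$\sqrt{nT}(\hat\theta - \theta_0) \to_d -J_s^{-1} N\left(\kappa B_s, J_s\right),$$

where $J_s = E[G_{\theta_i}' P_{\alpha_i} G_{\theta_i}]$, $B_s = E[B^B_{si} + B^C_{si}]$, $B^B_{si} = -G_{\theta_i}'[B^I_{\lambda_i} + B^G_{\lambda_i} + B^\Omega_{\lambda_i} + B^W_{\lambda_i}]$, $B^C_{si} = \sum_{j=0}^{\infty} E[G_{\theta_i}(z_{it})' P_{\alpha_i} g(z_{i,t-j})]$. The expressions for $B^I_{\lambda_i}$, $B^G_{\lambda_i}$, $B^\Omega_{\lambda_i}$ and $B^W_{\lambda_i}$ are given in Lemma 2.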



Theorem 4 establishes that one iteration of the GMM procedure not only improves
asymptotic efficiency by reducing the variance of the influence function, but also removes the
variance and non-optimal weighting matrix components from the bias. The higher-order
bias of the estimator of the individual effects, $B_{\gamma_i}$, now has four components, as in Newey and
Smith (2004). These components correspond to the asymptotic bias for a GMM estimator
with the optimal score, $B^I_{\gamma_i}$; the bias arising from estimation of $G_{\alpha_i}$, $B^G_{\gamma_i}$; the bias arising
from estimation of $\Omega_i$, $B^\Omega_{\gamma_i}$; and the bias arising from the choice of the preliminary first step
estimator, $B^W_{\gamma_i}$. An additional iteration of the GMM estimator removes the term $B^W_{\gamma_i}$.

The general procedure for deriving the asymptotic distribution of the FE-GMM estimators
consists of several expansions. First, we derive higher-order asymptotic expansions for the
estimators of the individual effects, with the common parameter fixed at its true value $\theta_0$.
Next, we obtain the asymptotic distribution for the profile score of the common parameter
at $\theta_0$ using the expansions of the estimators of the individual effects. Finally, we derive the
asymptotic distribution of the estimator of the common parameter by multiplying the asymptotic
distribution of the score by the limit profile Jacobian matrix. This procedure is detailed
in the online appendix, Fernández-Val and Lee (2012). Here we characterize the asymptotic
bias in a linear correlated random coefficient model with endogenous regressors. Motivated
by the numerical and empirical examples that follow, we consider a model where only the
variables with common parameter are endogenous and allow for the moment conditions not
to be martingale difference sequences.

Example: Correlated random coefficient model with endogenous regressors. We
consider a simplified version of the models in the empirical and numerical examples. The
notation is the same as in the theorems discussed above. The moment condition is

$$g(z_{it}; \theta, \alpha_i) = w_{it}(y_{it} - x_{1it}'\alpha_i - x_{2it}'\theta),$$

where $w_{it} = (x_{1it}', w_{2it}')'$ and $z_{it} = (x_{1it}', x_{2it}', w_{2it}', y_{it})'$. That is, only the regressors with
common coefficients are endogenous. Let $\epsilon_{it} = y_{it} - x_{1it}'\alpha_{i0} - x_{2it}'\theta_0$. To simplify the expressions
for the bias, we assume that $\epsilon_{it} \mid w_i, \alpha_i \sim$ i.i.d.$(0, \sigma_\epsilon^2)$ and $E[x_{2it}\epsilon_{i,t-j} \mid w_i, \alpha_i] = E[x_{2it}\epsilon_{i,t-j}]$,
for $w_i = (w_{i1}, \ldots, w_{iT})'$ and $j \in \{0, \pm 1, \ldots\}$. Under these conditions, the optimal weighting
matrices are proportional to $E[w_{it}w_{it}']$, which do not depend on $\theta_0$ and $\alpha_{i0}$. We can therefore
obtain the optimal GMM estimator in one step using the sample averages $T^{-1}\sum_{t=1}^T w_{it}w_{it}'$
to estimate the optimal weighting matrices.

In this model, it is straightforward to see that the estimators of the individual effects have
no bias, that is $B^{W,I}_{\gamma_i} = B^{W,G}_{\gamma_i} = B^{W,1S}_{\gamma_i} = 0$. By linearity of the first order conditions in $\theta$ and
$\alpha_i$, $B^{W,V}_{si} = 0$. The only source of bias is the correlation between the estimators of $\theta$ and $\alpha_i$.
After some straightforward but tedious algebra, this bias simplifies to

oo 
j=-oo 

For the limit Jacobian, we find 

where variables with tilde indicate residuals of population linear projections of the
corresponding variable on $x_{1it}$, for example $\tilde x_{2it} = x_{2it} - E[x_{2it}x_{1it}']E[x_{1it}x_{1it}']^{-1}x_{1it}$. The expression
of the bias is

(4.3) $\mathcal{B}(\theta_0) = -(d_g - d_\alpha)(J_s^W)^{-1}\sum_{j=-\infty}^{\infty} E[\tilde x_{2it}(\tilde y_{i,t-j} - \tilde x_{2i,t-j}'\theta_0)].$




In random coefficient models the ultimate quantities of interest are often functions of 
the data, model parameters and individual effects. The following corollaries characterize 
the asymptotic distributions of the fixed effects estimators of these quantities. The first 
corollary applies to averages of functions of the data and individual effects such as average 
partial effects and average derivatives in nonlinear models, and average elasticities in linear 
models with variables in levels. Section 6 gives an example of these elasticities. The second 
corollary applies to averages of smooth functions of the individual effects including means, 
variances and other moments of the distribution of these effects. Sections 2 and 6 give 
examples of these functions. We state the results only for estimators constructed from two- 
step estimators of the common parameters and individual effects. Similar results apply to 
estimators constructed from one-step estimators. Both corollaries follow from Lemma 2 and
Theorem 4 by the delta method.



Corollary 1 (Asymptotic distribution for fixed effects averages). Let $\zeta(z; \theta, \alpha_i)$ be a twice
continuously differentiable function in its second and third argument, such that $\inf_i \mathrm{Var}[\zeta(z_{it})] > 0$,
$E\bar{E}[\zeta(z_{it})^2] < \infty$, $E\bar{E}|\zeta_\alpha(z_{it})|^2 < \infty$, and $E\bar{E}|\zeta_\theta(z_{it})|^2 < \infty$, where the subscripts on $\zeta$
denote partial derivatives. Then, under the conditions of Theorem 4,



n 



T((-C)^N(kB c ,V ( ), 



where ( = EE [((z it )} 



B c = EE 



d a 



^(aXzitYH^gizi^j) + ( ai (z it y 'B ai +^2( aai> .(z it )"E ai /2 - C,fi{z it )'J s 1 B S 

j=0 j=l 



forB ai = B I ai + B°+B*+BZ, and 



V c = EE 







j=-oo 



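Corollary 2 (Asymptotic distribution for smooth functions of individual effects). Let $\mu(\alpha_i)$
be a twice differentiable function such that $E[\mu(\alpha_{i0})^2] < \infty$ and $E|\mu_\alpha(\alpha_{i0})|^2 < \infty$, where the
subscripts on $\mu$ denote partial derivatives. Then, under the conditions of Theorem 4,

$$\sqrt{n}(\hat\mu - \mu) \to_d N(\kappa B_\mu, V_\mu),$$

where $\mu = E[\mu(\alpha_{i0})]$,

$$B_\mu = E\Big[\mu_{\alpha_i}(\alpha_{i0})' B_{\alpha_i} + \sum_{j=1}^{d_\alpha} \mu_{\alpha\alpha_i,j}(\alpha_{i0})'\Sigma_{\alpha_i,j}/2\Big],$$

for $B_{\alpha_i} = B^I_{\alpha_i} + B^G_{\alpha_i} + B^\Omega_{\alpha_i} + B^W_{\alpha_i}$, and $V_\mu = E[(\mu(\alpha_{i0}) - \mu)^2]$.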
The convergence rates are different in Corollaries 1 and 2. The $(nT)^{-1/2}$ rate in Corollary
1 follows because the averages are over the cross sectional and time dimensions, whereas
the $n^{-1/2}$ rate in Corollary 2 follows because the averages are only over the cross sectional
dimension.



5. Bias Corrections 

The FE-GMM estimators of common parameters, while consistent, have bias in the
asymptotic distributions under sequences where n and T grow at the same rate. These sequences
provide a good approximation to the finite sample behavior of the estimators in empirical
applications where the time dimension is moderately large. The presence of bias invalidates
any asymptotic inference because the bias is of the same order as the variance. In this section
we describe bias correction methods to adjust the asymptotic distribution of the FE-GMM
estimators of the common parameter and smooth functions of the data, model parameters
and individual effects. All the corrections considered are analytical. Alternative corrections
based on variations of the jackknife can be implemented using the approaches described in Hahn
and Newey (2004) and Dhaene and Jochmans (2010).
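
To make the inference problem concrete, the following minimal sketch computes the actual coverage of a nominal 95% confidence interval when the standardized estimator is distributed as N(b, 1) instead of N(0, 1). The function names and the grid of standardized biases b are illustrative assumptions, not quantities taken from the paper.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def coverage_with_bias(standardized_bias: float, z: float = 1.959964) -> float:
    """Coverage of 'estimate +/- z * se' when (estimate - truth) / se ~ N(b, 1).

    The truth is covered when |Z + b| <= z for Z ~ N(0, 1).
    """
    b = standardized_bias
    return normal_cdf(z - b) - normal_cdf(-z - b)


# Illustrative grid of standardized biases (hypothetical values).
for b in (0.0, 0.5, 1.0, 2.0):
    print(f"standardized bias {b:.1f}: coverage {coverage_with_bias(b):.3f}")
```

With no bias the interval has its nominal coverage, and coverage falls as the bias grows relative to the standard error, which is the situation the corrections in this section are designed to avoid.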

We consider three analytical methods that differ in whether the bias is corrected from the 
estimator or from the first order conditions, and in whether the correction is one-step or 
iterated for methods that correct the bias from the estimator. All these methods reduce the 
order of the asymptotic bias without increasing the asymptotic variance. They are based on 
analytical estimators of the bias of the profile score B s and the profile Jacobian matrix J s . 
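Since these quantities include cross sectional and time series means $E$ and $\bar{E}$ evaluated at the
true parameter values for the common parameter and individual effects, they are estimated
by the corresponding cross sectional and time series averages evaluated at the FE-GMM
estimates. Thus, for any function of the data, common parameter and individual effects
$f_{it}(\theta, \alpha_i)$, let $\hat f_{it}(\theta) = f_{it}(\theta, \hat\alpha_i(\theta))$, $\hat f_i(\theta) = \hat{\bar{E}}[\hat f_{it}(\theta)] = T^{-1}\sum_{t=1}^T \hat f_{it}(\theta)$ and $\hat f(\theta) = \hat E[\hat f_i(\theta)] = n^{-1}\sum_{i=1}^n \hat f_i(\theta)$. Next, define $\hat\Sigma_{\alpha_i}(\theta) = [\hat G_{\alpha_i}(\theta)'\hat\Omega_i^{-1}\hat G_{\alpha_i}(\theta)]^{-1}$, $\hat H_{\alpha_i}(\theta) = \hat\Sigma_{\alpha_i}(\theta)\hat G_{\alpha_i}(\theta)'\hat\Omega_i^{-1}$,
and $\hat P_{\alpha_i}(\theta) = \hat\Omega_i^{-1} - \hat\Omega_i^{-1}\hat G_{\alpha_i}(\theta)\hat H_{\alpha_i}(\theta)$. To simplify the presentation, we only give explicit formulas
for FE-GMM three-step estimators in the main text. We give the expressions for one and
two-step estimators in the Supplementary Appendix. Let

$$\hat{\mathcal{B}}(\theta) = -\hat J_s(\theta)^{-1}\hat B_s(\theta), \quad \hat B_s(\theta) = \hat E[\hat B^B_{si}(\theta) + \hat B^C_{si}(\theta)], \quad \hat J_s(\theta) = \hat E[\hat G_{\theta_i}(\theta)'\hat P_{\alpha_i}(\theta)\hat G_{\theta_i}(\theta)],$$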



Hahn, Kuersteiner and Newey (2004) show that analytical, bootstrap, and jackknife bias correction
methods are asymptotically equivalent up to third order for MLE. We conjecture that the same result applies to
GMM estimators, but the proof is beyond the scope of this paper.
where $\hat B^B_{si}(\theta) = -\hat G_{\theta_i}(\theta)'[\hat B^I_{\lambda_i}(\theta) + \hat B^G_{\lambda_i}(\theta) + \hat B^\Omega_{\lambda_i}(\theta) + \hat B^W_{\lambda_i}(\theta)]$,

d a IT 

BiM = -p CH {e)^d CUHd (0)% Xi (e)/2 + P at (9)^T- 1 G ait {e)H ai {e)9i,t-o{e), 

3=1 3=0 t=j+l 

oo T 

BZ(0) = H a M'J2 T ^ E G au (d)'P*M9i,t-3(0), 

3=0 t= 3 + l 

I T 

3=0 t=j+l 

and $\hat B^C_{si}(\theta) = T^{-1}\sum_{j=0}^{\ell}\sum_{t=j+1}^{T}\hat G_{\theta it}(\theta)'\hat P_{\alpha_i}(\theta)\hat g_{i,t-j}(\theta)$. In the previous expressions, the
spectral time series averages that involve an infinite number of terms are trimmed. The trimming
parameter $\ell$ is a positive bandwidth that needs to be chosen such that $\ell \to \infty$ and $\ell/T \to 0$
as $T \to \infty$ (Hahn and Kuersteiner, 2011).
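
The trimmed sums in these bias estimators share one computational pattern: a one-sided sum of cross products between a series and its own or another series' lags, truncated at the bandwidth. The sketch below illustrates that pattern for scalar series. The function name `trimmed_cross_sum` and the scalar setting are assumptions made for illustration, since the paper's terms are matrix valued.

```python
def trimmed_cross_sum(a, b, bandwidth):
    """Return T^{-1} * sum_{j=0}^{bandwidth} sum_{t=j+1}^{T} a_t * b_{t-j}.

    `a` and `b` are equal-length sequences of floats for one individual,
    and `bandwidth` is the trimming parameter (0 <= bandwidth < T).
    """
    T = len(a)
    if len(b) != T:
        raise ValueError("a and b must have the same length")
    if not 0 <= bandwidth < T:
        raise ValueError("bandwidth must satisfy 0 <= bandwidth < T")
    total = 0.0
    for j in range(bandwidth + 1):
        # Zero-based t = j, ..., T-1 corresponds to one-based t = j+1, ..., T.
        for t in range(j, T):
            total += a[t] * b[t - j]
    return total / T


# Small worked case: lag 0 contributes 1+2+3 = 6, lag 1 contributes 2+3 = 5.
print(trimmed_cross_sum([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], bandwidth=1))
```

Setting the bandwidth to zero keeps only the contemporaneous term, which is the relevant case when the moment conditions are martingale differences.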

The one-step correction of the estimator subtracts an estimator of the expression of the
asymptotic bias from the estimator of the common parameter. Using the expressions defined
above evaluated at $\hat\theta$, the bias-corrected estimator is

(5.1) $\hat\theta^{BC} = \hat\theta - \hat{\mathcal{B}}(\hat\theta)/T.$

This bias correction is straightforward to implement because it only requires one
optimization. The iterated correction is equivalent to solving the nonlinear equation

(5.2) $\hat\theta^{IBC} = \hat\theta - \hat{\mathcal{B}}(\hat\theta^{IBC})/T.$

When $\theta + \hat{\mathcal{B}}(\theta)$ is invertible in $\theta$, it is possible to obtain a closed-form solution to the previous
equation. Otherwise, an iterative procedure is needed. The score bias-corrected estimator
is the solution to the following estimating equation

(5.3) $\hat s(\hat\theta^{SBC}) - \hat B_s(\hat\theta^{SBC})/T = 0.$

This procedure, while computationally more intensive, has the attractive feature that both
estimator and bias are obtained simultaneously. Hahn and Newey (2004) show that fully
iterated bias-corrected estimators solve approximated bias-corrected first order conditions.
IBC and SBC are equivalent if the first order conditions are linear in $\theta$.
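
The difference between the one-step and iterated corrections can be sketched for a scalar parameter as follows. This is a minimal illustration under stated assumptions: `bias_fn` stands in for the estimated bias function, the linear form `a + c * theta` and all numerical values are hypothetical, and the iterated correction is solved by plain fixed-point iteration, which converges here because the bias function is a contraction after division by T.

```python
def one_step_bc(theta_hat, bias_fn, T):
    """One-step correction: subtract the estimated bias evaluated at theta_hat."""
    return theta_hat - bias_fn(theta_hat) / T


def iterated_bc(theta_hat, bias_fn, T, tol=1e-12, max_iter=1000):
    """Iterated correction: solve theta = theta_hat - bias_fn(theta) / T."""
    theta = theta_hat
    for _ in range(max_iter):
        new_theta = theta_hat - bias_fn(theta) / T
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical bias function that is linear in theta (illustrative values only).
a, c = 0.8, -1.5


def linear_bias(theta):
    return a + c * theta


theta_hat, T = 0.40, 20
theta_bc = one_step_bc(theta_hat, linear_bias, T)
theta_ibc = iterated_bc(theta_hat, linear_bias, T)
# With a linear bias function the fixed point has a closed form.
theta_closed = (theta_hat - a / T) / (1.0 + c / T)
print(theta_bc, theta_ibc, theta_closed)
```

In the linear case the iteration is unnecessary because the fixed point can be written down directly, which mirrors the closed-form iterated correction available when the bias is linear in the parameter.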

Example: Correlated random coefficient model with endogenous regressors. The 

previous methods can be illustrated in the correlated random coefficient model example in 
Section 4. Here, the fixed effects GMM estimators have closed forms: 

t=i J t=i 



See MacKinnon and Smith (1998) for a comparison of one-step and iterated bias correction methods.
i=l



/] x 2itW' 2 it 



and 

T 

t=i \t=i / t=i 

where = YZ=i\52Z=i %2it™2it(Y%=i ^tw' 2i ^~ x J2t=i ^2itX 2it ], and variables with tilde now 
indicate residuals of sample linear projections of the corresponding variable on Xm, for 
example x 2it = x 2it - Y^=i x 2itx' ut {Y^ = i x iitx' ut )~ l x 1 it. 

We can estimate the bias of $\hat\theta$ from the analytic formula in expression (4.3), replacing
population by sample moments and $\theta_0$ by $\hat\theta$, and trimming the number of terms in the
spectral expectation,

n £ min(r,T+i) 

s(0) = -(d g - d a ){jYy i Y,Y. E ^t(ht-o - /)■ 

i=l j=—£ i=max(l,j+l) 

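The one-step bias corrected estimates of the common parameter $\theta$ and the average of the
individual parameter $\alpha := E[\alpha_i]$ are

$$\hat\theta^{BC} = \hat\theta - \hat{\mathcal{B}}(\hat\theta)/T, \qquad \hat\alpha^{BC} = n^{-1}\sum_{i=1}^n \hat\alpha_i(\hat\theta^{BC}).$$

The iterated bias correction estimator can be derived analytically by solving

$$\hat\theta^{IBC} = \hat\theta - \hat{\mathcal{B}}(\hat\theta^{IBC})/T,$$

which has closed-form solution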



„ e min(T,T+j) 
i=l i=-f t=max(l j'+l) 



n £ min(T,T+j) 

+ (dg ~ d a )(jf ^ £ £ £ i 2lt ^ t _,/(nT 2 ) 

i=l j = -l t=max(l j'+l) 

The score bias correction is the same as the iterated correction because the first order
conditions are linear in $\theta$.

The bias correction methods described above yield normal asymptotic distributions
centered at the true parameter value for panels where n and T grow at the same rate with
the sample size. This result is formally stated in Theorem 5, which establishes that all the
methods are asymptotically equivalent, up to first order.



Theorem 5 (Limit distribution of bias-corrected FE-GMM). Assume that $\sqrt{nT}(\hat B_s(\bar\theta) - B_s)/T \to_p 0$ and $\sqrt{nT}(\hat J_s(\bar\theta) - J_s)/T \to_p 0$, for some $\bar\theta = \theta_0 + O_P((nT)^{-1/2})$. Under Conditions
1, 2, 3, 5, and 6, for $C \in \{BC, SBC, IBC\}$,

(5.4) $\sqrt{nT}(\hat\theta^C - \theta_0) \to_d N(0, J_s^{-1}),$

where $\hat\theta^{BC}$, $\hat\theta^{IBC}$ and $\hat\theta^{SBC}$ are defined in (5.1), (5.2) and (5.3), and $J_s = E[G_{\theta_i}' P_{\alpha_i} G_{\theta_i}]$.

The convergence condition for the estimators of $B_s$ and $J_s$ holds for sample analogs
evaluated at the initial FE-GMM one-step or two-step estimators if the trimming sequence is
chosen such that $\ell \to \infty$ and $\ell/T \to 0$ as $T \to \infty$. Theorem 5 also shows that all the
bias-corrected estimators considered are first-order asymptotically efficient, since their variances
achieve the semiparametric efficiency bound for the common parameters in this model, see
Chamberlain (1992).

The following corollaries give bias corrected estimators for averages of the data and indi- 
vidual effects and for moments of the individual effects, together with the limit distributions 
of these estimators and consistent estimators of their asymptotic variances. To construct 
the corrections, we use bias corrected estimators of the common parameter. The corollaries 
then follow from Lemma 2 and Theorem 5 by the delta method. We use the same notation
as in the estimation of the bias of the common parameters above to denote the estimators 
of the components of the bias and variance. 

Corollary 3 (Bias correction for fixed effects averages). Let ((z;9,ai) be a twice continu- 
ously differentiable function in its second and third argument, such that inf i Var[((zu)] > 0, 
EE[((z it ) 2 } < oo, EE[( a (z it ) 2 } < oo, and EE\( e (z it )\ 2 < oo. For C G {BC, SBC, IBC}, let 
(C = f(0C) _ B ( (9 C )/T where 

- e T ^ d a 

J=0 t=j+l j=l 



B c (9) = E 



where £ is a positive bandwidth such that I — > oo and E/T — > as T — > oo. Then, under the 
conditions of Theorem^ 

x/nT(C c -C)4iV(0,^), 

where ( and are defined in Corollary {J\ Also, for any 9 = 9q + Op^nT)^ 1 ^ 2 ) and 
( = ( + P ((nT)-V 2 ), 

i 

is a consistent estimator for . 



T 

3=0 t=j + l 



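Corollary 4 (Bias correction for smooth functions of individual effects). Let $\mu(\alpha_i)$ be a
twice differentiable function such that $E[\mu(\alpha_{i0})^2] < \infty$ and $E|\mu_\alpha(\alpha_{i0})|^2 < \infty$. For $C \in \{BC, SBC, IBC\}$, let $\hat\mu^C = \hat E[\hat\mu_i(\hat\theta^C)] - \hat B_\mu(\hat\theta^C)/T$, where $\hat\mu_i(\theta) = \mu(\hat\alpha_i(\theta))$, and $\hat B_\mu(\theta) = \hat E[\hat\mu_{\alpha_i}(\theta)'\hat B_{\alpha_i}(\theta) + \sum_{j=1}^{d_\alpha}\hat\mu_{\alpha\alpha_i,j}(\theta)'\hat\Sigma_{\alpha_i,j}(\theta)/2]$. Then, under the conditions of Theorem 5,

$$\sqrt{n}(\hat\mu^C - \mu) \to_d N(0, V_\mu),$$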






where $\mu = E[\mu(\alpha_{i0})]$ and $V_\mu = E[(\mu(\alpha_{i0}) - \mu)^2]$. Also, for any $\bar\theta = \theta_0 + O_P((nT)^{-1/2})$ and
$\bar\mu = \mu + O_P(n^{-1/2})$,



(5.5) 



is a consistent estimator for $V_\mu$. The second term in (5.5) is included to improve the finite
sample properties of the estimator in short panels.



6. Empirical example 

We illustrate the new estimators with an empirical example based on the classical cigarette 
demand study of Becker, Grossman and Murphy (1994) (BGM hereafter). Cigarettes are ad- 
dictive goods. To account for this addictive nature, early cigarette demand studies included 
lagged consumption as explanatory variables (e.g., Baltagi and Levin, 1986). This approach, 
however, ignores that rational or forward-looking consumers take into account the effect of 
today's consumption decision on future consumption decisions. Becker and Murphy (1988) 
developed a model of rational addiction where expected changes in future prices affect the 
current consumption. BGM empirically tested this model using a linear structural demand 
function based on quadratic utility assumptions. The demand function includes both future 
and past consumptions as determinants of current demand, and the future price affects the 
current demand only through the future consumption. They found that the effect of future
consumption on current consumption is significant, which they took as evidence in favor of
the rational model.

Most of the empirical studies in this literature use yearly state-level panel data sets. They 
include fixed effects to control for additive heterogeneity at the state-level and use leads and 
lags of cigarette prices and taxes as instruments for leads and lags of consumption. These 
studies, however, do not consider possible non-additive heterogeneity in price elasticities or 
sensitivities across states. There are multiple reasons why there may be heterogeneity in the 
price effects across states correlated with the price level. First, the considerable differences 
in income, industrial, ethnic and religious composition at inter-state level can translate into 
different tastes and policies toward cigarettes. Second, from the perspective of the theoretical 
model developed by Becker and Murphy (1988), the price effect is a function of the marginal 
utility of wealth that varies across states and depends on cigarette prices. If the price 
effect is heterogeneous and correlated with the price level, a fixed coefficient specification
may produce substantial bias in estimating the average elasticity of cigarette consumption 
because the between variation of price is much larger than the within variation. Wangen 
(2004) gives additional theoretical reasons against a fixed coefficient specification for the 
demand function in this application. 
We consider the following linear specification for the demand function

(6.1) $C_{it} = \alpha_{0i} + \alpha_{1i}P_{it} + \theta_1 C_{i,t-1} + \theta_2 C_{i,t+1} + X_{it}'\delta + \epsilon_{it},$

where $C_{it}$ is cigarette consumption in state $i$ at time $t$ measured by per capita sales in packs;
$\alpha_{0i}$ is an additive state effect; $\alpha_{1i}$ is a state specific price coefficient; $P_{it}$ is the price in 1982-1984
dollars; and $X_{it}$ is a vector of covariates which includes income, various measures of
incentive for smuggling across states, and year dummies. We estimate the model parameters
using OLS and IV methods with both fixed coefficient for price and random coefficient for
price. The data set, consisting of an unbalanced panel of 51 U.S. states over the years 1957
to 1994, is the same as in Fenn, Antonovitz and Schroeter (2001). The set of instruments for
$C_{i,t-1}$ and $C_{i,t+1}$ in the IV estimators is the same as in specification 3 of BGM and includes
$X_{it}$, $P_{it}$, $P_{i,t-1}$, $P_{i,t+1}$, $Tax_{it}$, $Tax_{i,t-1}$, and $Tax_{i,t+1}$, where $Tax_{it}$ is the state excise tax for
cigarettes in 1982-1984 dollars.

Table 1 reports estimates of coefficients and demand elasticities. We focus on the
coefficients of the key variables, namely $P_{it}$, $C_{i,t-1}$ and $C_{i,t+1}$. Throughout the table, FC refers
to the fixed coefficient specification with $\alpha_{1i} = \alpha_1$ and RC refers to the random coefficient
specification in equation (6.1). BC and IBC refer to estimates after bias correction and
iterated bias correction, respectively. Demand elasticities are calculated using the expressions in
Appendix A of BGM. They are functions of $C_{it}$, $P_{it}$, $\alpha_{1i}$, $\theta_1$ and $\theta_2$, linear in $\alpha_{1i}$. For random
coefficient estimators, we report the mean of individual elasticities, i.e.

$$\hat\zeta_h = (nT)^{-1}\sum_{i=1}^n\sum_{t=1}^T \zeta_h(z_{it}; \hat\theta, \hat\alpha_i),$$

where $\zeta_h(z_{it}; \theta, \alpha_i) = \partial\log C_{it(h)}/\partial\log P_{it(h)}$ are price elasticities at different time horizons
$h$. Standard errors for the elasticities are obtained by the delta method as described in
Corollaries 3 and 4. For bias-corrected RC estimators the standard errors use bias-corrected
estimates of $\theta$ and $\alpha_i$.

Like BGM, we find that OLS estimates substantially differ from their IV counterparts.
IV-FC underestimates the elasticities relative to IV-RC. For example, the long-run
elasticity estimate is -0.70 with IV-FC, whereas it is -0.88 with IV-RC. This difference is also
pronounced for short-run elasticities, where the IV-RC estimates are more than 25 percent
larger than the IV-FC estimates. We observe the same pattern throughout the table for
every elasticity. The bias comes from both the estimation of the common parameter $\theta_2$
and the mean of the individual specific parameter $E[\alpha_{1i}]$. The bias corrections increase the
coefficient of future consumption $C_{i,t+1}$ and reduce the absolute value of the mean of the
price coefficient. Moreover, they have a significant impact on the estimator of dispersion of
the price coefficient. The uncorrected estimates of the standard deviation are more than
20% larger than the bias corrected counterparts. In the online appendix, Fernández-Val and
Lee (2012), we show through a Monte Carlo experiment calibrated to this empirical example
that the bias is generally large for dispersion parameters and the bias corrections are effective
in reducing this bias. As a consequence of shrinking the estimates of the dispersion of $\alpha_{1i}$,
we obtain smaller standard errors for the estimates of $E[\alpha_{1i}]$ throughout the table. In the
Monte Carlo experiment, we also find that this correction in the standard errors provides
improved inference.

7. Conclusion 

This paper introduces a new class of fixed effects GMM estimators for panel data mod- 
els with unrestricted nonadditive heterogeneity and endogenous regressors. Bias correction 
methods are developed because these estimators suffer from the incidental parameters prob- 
lem. Other estimators based on moment conditions, like the class of GEL estimators, can be 
analyzed using a similar methodology. An attractive alternative framework for estimation 
and inference in random coefficient models is a flexible Bayesian approach. It would be inter- 
esting to explore whether there are connections between moments of posterior distributions 
in the Bayesian approach and the fixed effects estimators considered in the paper. Another 
interesting extension would be to find bias reducing priors in the GMM framework similar 
to the ones characterized by Arellano and Bonhomme (2009) in the MLE framework. We 
leave these extensions to future research. 
Table 1: Estimates of Rational Addiction Model for Cigarette Demand

                                 OLS-FC    IV-FC   OLS-RC   OLS-RC   OLS-RC    IV-RC    IV-RC    IV-RC
                                                      NBC       BC      IBC      NBC       BC      IBC
Coefficients
(Mean) P_t                        -9.58   -34.10   -13.49   -13.58   -13.26   -36.39   -31.26   -31.26
                                 (1.86)   (4.10)   (3.55)   (3.55)   (3.55)   (4.85)   (4.62)   (4.64)
(Std. Dev.) P_t                                       4.35     4.22     4.07    12.86    10.45    10.60
                                                    (0.98)   (1.02)   (1.03)   (2.35)   (2.13)   (2.15)
C_{t-1}                            0.49     0.45     0.48     0.48     0.48     0.44     0.44     0.45
                                 (0.01)   (0.06)   (0.04)   (0.04)   (0.04)   (0.04)   (0.04)   (0.04)
C_{t+1}                            0.44     0.17     0.44     0.43     0.44     0.23     0.29     0.27
                                 (0.01)   (0.07)   (0.04)   (0.04)   (0.04)   (0.05)   (0.05)   (0.05)
Price elasticities
Long-run                          -1.05    -0.70    -1.30    -1.31    -1.28    -0.88    -0.91    -0.90
                                 (0.24)   (0.12)   (0.28)   (0.28)   (0.28)   (0.09)   (0.10)   (0.10)
Own Price (Anticipated)           -0.20    -0.32    -0.27    -0.27    -0.27    -0.38    -0.35    -0.35
                                 (0.04)   (0.04)   (0.06)   (0.06)   (0.06)   (0.04)   (0.04)   (0.04)
Own Price (Unanticipated)         -0.11    -0.29    -0.15    -0.16    -0.15    -0.33    -0.29    -0.29
                                 (0.02)   (0.03)   (0.04)   (0.04)   (0.04)   (0.04)   (0.04)   (0.04)
Future Price (Unanticipated)      -0.07    -0.05    -0.10    -0.10    -0.09    -0.09    -0.10    -0.09
                                 (0.01)   (0.03)   (0.02)   (0.02)   (0.02)   (0.02)   (0.02)   (0.02)
Past Price (Unanticipated)        -0.08    -0.14    -0.11    -0.11    -0.10    -0.16    -0.15    -0.15
                                 (0.01)   (0.02)   (0.03)   (0.02)   (0.03)   (0.02)   (0.02)   (0.02)
Short-Run                         -0.30    -0.35    -0.41    -0.41    -0.40    -0.44    -0.44    -0.43
                                 (0.05)   (0.06)   (0.12)   (0.12)   (0.12)   (0.06)   (0.06)   (0.06)

RC/FC refers to random/fixed coefficient model. NBC/BC/IBC refers to no bias correction/bias
correction/iterated bias correction estimates.
Note: Standard errors are in parentheses.
FIGURE 1. Normal approximation to the distribution of price effects using
uncorrected (solid line) and bias corrected (dashed line) estimates of the mean
and standard deviation of the distribution of price effects. Uncorrected
estimates of the mean and standard deviation are -36 and 13, bias corrected
estimates are -31 and 10. (Horizontal axis: price effect.)
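
The two curves described in the caption can be regenerated from the reported means and standard deviations alone. The sketch below evaluates both normal densities on a coarse grid and computes the share of the approximating distribution with a positive price effect. The grid and the positive-share summary are illustrative choices, not results reported in the paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# (mean, standard deviation) pairs taken from the figure caption.
UNCORRECTED = (-36.0, 13.0)
CORRECTED = (-31.0, 10.0)

for x in range(-80, 21, 20):
    print(x, round(normal_pdf(x, *UNCORRECTED), 5), round(normal_pdf(x, *CORRECTED), 5))

# Share of the approximating distribution with a positive price effect.
share_positive_uncorrected = 1.0 - normal_cdf(0.0, *UNCORRECTED)
share_positive_corrected = 1.0 - normal_cdf(0.0, *CORRECTED)
print(share_positive_uncorrected, share_positive_corrected)
```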
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This supplement to the paper "Panel Data Models with Nonadditive Unobserved Heterogeneity: Estima- 
tion and Inference" provides additional numerical examples and the proofs of the main results. The appendix 
is organized in seven sections. The first section contains a Monte Carlo simulation calibrated to the empirical 
example of the paper. The second section gives the proofs of the consistency of the one-step and two-step 
FE-GMM estimators. The third section includes the derivations of the asymptotic distribution of one-step 
and two-step FE-GMM estimators. The forth section provides the derivations of the asymptotic distribu- 
tion of bias corrected FE-GMM estimators. The fifth and sixth sections contain the characterization of the 
stochastic expansions for the estimators of the individual effects and the scores. The last section includes 
the expressions for the scores and their derivatives. 

Throughout the appendices $O_{uP}$ and $o_{uP}$ will denote uniform orders in probability. For example, for a 
sequence of random variables $\{\xi_i : 1 \le i \le n\}$, $\xi_i = O_{uP}(1)$ means 
$\sup_{1 \le i \le n} \xi_i = O_P(1)$, and $\xi_i = o_{uP}(1)$ means $\sup_{1 \le i \le n} \xi_i = o_P(1)$. 
It can be shown that the usual algebraic properties for $O_P$ and $o_P$ orders also apply to the uniform 
orders $O_{uP}$ and $o_{uP}$. Let $e_j$ denote a $1 \times d_g$ unitary vector with a one in position $j$. 
For a matrix $A$, $|A|$ denotes Euclidean norm, that is $|A|^2 = \mathrm{trace}[AA']$. HK refers to Hahn 
and Kuersteiner (2011). 
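
As a small numerical illustration of the norm convention above, the following sketch checks that the sum of squared entries of a matrix equals trace[AA']. The helper names (`a_a_transpose`, `trace`, `euclidean_norm_squared`, `unit_vector`) and the example matrix are assumptions made for this illustration and do not appear in the paper.

```python
def a_a_transpose(a):
    """Return A A' for a matrix A stored as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in a] for row_i in a]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def euclidean_norm_squared(a):
    """|A|^2 computed directly as the sum of squared entries."""
    return sum(x * x for row in a for x in row)


def unit_vector(j, d):
    """1 x d vector with a one in position j (1-based), as in the definition of e_j."""
    return [1.0 if k == j else 0.0 for k in range(1, d + 1)]


# Hypothetical 2 x 3 example matrix.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
norm_sq_from_trace = trace(a_a_transpose(A))
norm_sq_direct = euclidean_norm_squared(A)
```

Both routes give the same number for any real matrix, which is why the two descriptions of $|A|$ can be used interchangeably.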

Appendix A. Numerical example 

We design a Monte Carlo experiment to closely match the cigarette demand empirical example in the 
paper. In particular, we consider the following linear model with common and individual specific parameters: 

$$C_{it} = \alpha_{0i} + \alpha_{1i} P_{it} + \theta_1 C_{i,t-1} + \theta_2 C_{i,t+1} + \psi \epsilon_{it},$$

$$P_{it} = \eta_{0i} + \eta_{1i} Tax_{it} + u_{it}, \quad (i = 1, 2, \ldots, n, \ t = 1, 2, \ldots, T);$$

where $\{(\alpha_{ji}, \eta_{ji}) : 1 \le i \le n\}$ is i.i.d. bivariate normal with mean $(\mu_j, \mu_{\eta_j})$, 
variances $(\sigma_j^2, \sigma_{\eta_j}^2)$, and correlation $\rho_j$, for $j \in \{0, 1\}$, independent across $j$; 
$\{u_{it} : 1 \le t \le T, 1 \le i \le n\}$ is i.i.d. $N(0, \sigma_u^2)$; and 
$\{\epsilon_{it} : 1 \le t \le T, 1 \le i \le n\}$ is i.i.d. standard normal. We fix the values of $Tax_{it}$ to the 
values in the data set. All the parameters other than $\rho_1$ and $\psi$ are calibrated to the data set. Since the 
panel is balanced for only 1972 to 1994, we set $T = 23$ and generate balanced panels for the simulations. 
Specifically, we consider 

$n = 51$, $T = 23$; $\mu_0 = 72.86$, $\mu_1 = -31.26$, $\mu_{\eta_0} = 0.81$, $\mu_{\eta_1} = 0.13$, $\sigma_0 = 18.54$, 
$\sigma_1 = 10.60$, $\sigma_{\eta_0} = 0.14$, $\sigma_{\eta_1} = 2.05$, $\sigma_u = 0.15$, $\theta_1 = 0.45$, $\theta_2 = 0.27$, 
$\rho_0 = -0.17$, $\rho_1 \in \{0, 0.3, 0.6, 0.9\}$, $\psi \in \{2, 4, 6\}$. 

In the empirical example, the estimated values of $\rho_1$ and $\psi$ are close to 0.3 and 5, respectively. 
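
The random coefficients in this design can be drawn with a few lines of code. The sketch below is only an illustration under stated assumptions: it reads the calibrated dispersion values as standard deviations, draws each correlated pair through a Cholesky factor, and uses names (`draw_pair`, `draw_coefficients`, `sample_corr`) that are hypothetical and not taken from the paper. It does not generate the tax, price, or consumption series.

```python
import math
import random

# Calibrated values quoted in the text (dispersions read as standard deviations).
MU_0, MU_1 = 72.86, -31.26
MU_ETA_0, MU_ETA_1 = 0.81, 0.13
SIGMA_0, SIGMA_1 = 18.54, 10.60
SIGMA_ETA_0, SIGMA_ETA_1 = 0.14, 2.05
RHO_0 = -0.17


def draw_pair(rng, mean_a, mean_e, sd_a, sd_e, rho):
    """Draw (alpha, eta) from a bivariate normal using a Cholesky factor."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    alpha = mean_a + sd_a * z1
    eta = mean_e + sd_e * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return alpha, eta


def draw_coefficients(n, rho_1, seed=0):
    """Draw n sets of (alpha_0i, eta_0i, alpha_1i, eta_1i), independent across j."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        a0, e0 = draw_pair(rng, MU_0, MU_ETA_0, SIGMA_0, SIGMA_ETA_0, RHO_0)
        a1, e1 = draw_pair(rng, MU_1, MU_ETA_1, SIGMA_1, SIGMA_ETA_1, rho_1)
        draws.append({"alpha0": a0, "eta0": e0, "alpha1": a1, "eta1": e1})
    return draws


def sample_corr(xs, ys):
    """Plain sample correlation, used here only to sanity-check the draws."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# One simulated cross section with n = 51 and rho_1 = 0.3.
panel_coefficients = draw_coefficients(51, rho_1=0.3)
```

With $n = 51$ the sample correlation of a single draw is noisy; the calibrated value of $\rho_1$ is recovered only on average or in much larger samples.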

Since the model is dynamic with leads and lags of the dependent variable on the right hand side, we 
construct the series of $C_{it}$ by solving the difference equation following BGM. The stationary part of the 
solution is 

$$C_{it} = \frac{1}{\theta_1 \phi_1 (\phi_2 - \phi_1)} \sum_{s=1}^{\infty} \phi_1^{s} h_i(t+s) + \frac{1}{\theta_1 \phi_2 (\phi_2 - \phi_1)} \sum_{s=0}^{\infty} \phi_2^{-s} h_i(t-s),$$

where 

$$h_i(t) = \alpha_{0i} + \alpha_{1i} P_{i,t-1} + \psi \epsilon_{i,t-1}, \quad \phi_1 = \frac{1 - (1 - 4\theta_1\theta_2)^{1/2}}{2\theta_1}, \quad \phi_2 = \frac{1 + (1 - 4\theta_1\theta_2)^{1/2}}{2\theta_1}.$$

In our specification, these values are $\phi_1 = 0.31$ and $\phi_2 = 1.91$. The parameters that we vary across the 
experiments are $\rho_1$ and $\psi$. The parameter $\rho_1$ controls the degree of correlation between $\alpha_{1i}$ and 
$P_{it}$ and determines the bias caused by using fixed coefficient estimators. The parameter $\psi$ controls the 
degree of endogeneity in $C_{i,t-1}$ and $C_{i,t+1}$, which determines the bias of OLS and the incidental parameter 
bias of random coefficient IV estimators. Although $\psi$ is not an ideal experimental parameter because it 
governs the variance of the error, it is the only free parameter that affects the endogeneity of $C_{i,t-1}$ and 
$C_{i,t+1}$. In this design we cannot fully remove the endogeneity of $C_{i,t-1}$ and $C_{i,t+1}$ because of the dynamics. 
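
The two root values quoted above can be checked directly. The sketch below assumes the roots are $(1 \mp (1 - 4\theta_1\theta_2)^{1/2})/(2\theta_1)$ and plugs in the calibrated $\theta_1 = 0.45$ and $\theta_2 = 0.27$; the function name `characteristic_roots` is hypothetical.

```python
import math


def characteristic_roots(theta_1, theta_2):
    """Roots (1 -/+ sqrt(1 - 4*theta_1*theta_2)) / (2*theta_1) of the difference equation."""
    disc = 1.0 - 4.0 * theta_1 * theta_2
    if disc < 0.0:
        raise ValueError("complex roots: 4 * theta_1 * theta_2 exceeds 1")
    root = math.sqrt(disc)
    return (1.0 - root) / (2.0 * theta_1), (1.0 + root) / (2.0 * theta_1)


phi_1, phi_2 = characteristic_roots(0.45, 0.27)
print(round(phi_1, 2), round(phi_2, 2))  # 0.31 1.91
```

The smaller root lies inside the unit interval and the larger root exceeds one, and their product equals $\theta_2/\theta_1 = 0.6$.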

In each simulation, we estimate the parameters with standard fixed coefficient OLS and IV with additive 
individual effects (FC), and the FE-GMM OLS and IV estimators with the individual specific coefficients 
(RC). For IV, we use the same set of instruments as in the empirical example. We report results only for 
the common coefficient $\theta_2$, and the mean and standard deviation of the individual-specific coefficient 
$\alpha_{1i}$. Throughout the tables, Bias refers to the mean of the bias across simulations; SD refers to the 
standard deviation of the estimates; SE/SD denotes the ratio of the average standard error to the standard 
deviation; and p;.05 is the rejection frequency of a two-sided test with nominal level of 0.05 that the 
parameter is equal to its true value. For bias-corrected RC estimators the standard errors are calculated 
using bias corrected estimates of the common parameter and individual effects. 
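
The four table entries defined above can be computed from the simulation output as in the following sketch. It assumes a normal critical value of 1.96 for the two-sided 5% test and the sample (n - 1) standard deviation; the function name `summarize` and the toy inputs are hypothetical and are not results from the paper.

```python
import statistics


def summarize(estimates, std_errors, true_value, z_crit=1.96):
    """Bias, SD, SE/SD and rejection frequency across simulation replications."""
    bias = statistics.fmean(estimates) - true_value
    sd = statistics.stdev(estimates)  # standard deviation across replications
    se_over_sd = statistics.fmean(std_errors) / sd
    rejections = sum(
        abs(est - true_value) / se > z_crit
        for est, se in zip(estimates, std_errors)
    )
    return {
        "Bias": bias,
        "SD": sd,
        "SE/SD": se_over_sd,
        "p;.05": rejections / len(estimates),
    }


# Toy inputs: four replications with true value 1.0 and standard error 0.1 each.
toy = summarize([0.9, 1.1, 1.0, 1.4], [0.1, 0.1, 0.1, 0.1], true_value=1.0)
```

An SE/SD ratio near 1 indicates that the standard errors track the actual sampling dispersion, and a rejection frequency near 0.05 indicates correct test size.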

Table A1 reports the results for the estimators of $\theta_2$. We find significant biases in all the OLS estimators 
and in the IV-FC estimators, relative to the standard deviations of these estimators. The bias of OLS 
depends only on $\psi$, whereas the bias of IV-FC depends only on $\rho_1$. When $\rho_1 = 0$, there is no correlation 
between $\alpha_{1i}$ and $P_{it}$, and IV-FC estimates are consistent. As $\rho_1$ increases, the bias in IV-FC grows. 
IV-RC estimators have no bias in every configuration and their tests display much smaller size distortions 
than for the other estimators. The bias corrections preserve the bias and inference properties of the RC-IV 
estimator. 

Table A2 reports similar results for the estimators of the mean of the individual specific coefficient 
$\mu_1 = E[\alpha_{1i}]$. We find substantial biases for OLS and IV-FC estimators. RC-IV displays some bias, which 
is removed by the corrections in some configurations. The bias corrections provide significant improvements 
in the estimation of standard errors. IV-RC standard errors overestimate the dispersion by more than 15% 
when $\psi$ is greater than or equal to 4, whereas IV-BC or IV-IBC estimators have SE/SD ratios close to 1. 
As a result, bias corrected estimators show smaller size distortions. This improvement comes from the bias 
correction in the estimates of the dispersion of $\alpha_{1i}$ that we use to construct the standard errors. The 
bias of the estimator of the dispersion is generally large, and is effectively removed by the correction. We 
can see more evidence on this phenomenon in Table A3. 

Table A3 shows the results for the estimators of the standard deviation of the individual specific coefficient 
$\sigma_1 = E[(\alpha_{1i} - \mu_1)^2]^{1/2}$. As noted above, the bias corrections are relevant in this case. As $\psi$ 
increases, the bias grows in orders of $\psi^2$. Most of the bias is removed by the correction even when $\psi$ is 
large. For example, when $\psi = 6$, the bias of the IV-RC estimator is about 4, which is larger than two times 
its standard deviation. The correction reduces the bias to about 0.5, which is small relative to the standard 
deviation. Moreover, despite the overestimation in the standard errors, there are important size distortions 
for IV-RC estimators for tests on $\sigma_1$ when $\psi$ is large. The bias corrections bring the rejection 
frequencies close to their nominal levels. 
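
The reason the dispersion estimate is biased upward, with a bias driven by $\psi^2$, can be seen in a deliberately simplified toy calculation. The sketch below is not the paper's estimator or its bias correction: it assumes each individual estimate equals the true coefficient plus independent normal noise with variance proportional to $\psi^2 / T$, and it corrects by subtracting that known noise variance. The names `simulate_dispersion` and `noise_scale`, and the value of `noise_scale`, are hypothetical.

```python
import math
import random
import statistics


def simulate_dispersion(psi, n=20000, t=23, sigma=10.6, noise_scale=5.0, seed=0):
    """Return (naive SD, noise-corrected SD) of noisy individual estimates.

    Toy assumption: estimate_i = alpha_i + noise_i, with
    Var(noise_i) = (psi * noise_scale)**2 / t, independent of alpha_i.
    """
    rng = random.Random(seed)
    noise_var = (psi * noise_scale) ** 2 / t
    noisy = [
        rng.gauss(0.0, sigma) + rng.gauss(0.0, math.sqrt(noise_var))
        for _ in range(n)
    ]
    naive_var = statistics.pvariance(noisy)
    corrected_var = max(naive_var - noise_var, 0.0)  # subtract the known noise variance
    return math.sqrt(naive_var), math.sqrt(corrected_var)


naive_low, corrected_low = simulate_dispersion(psi=2)
naive_high, corrected_high = simulate_dispersion(psi=6)
```

In this toy setting the naive standard deviation overstates the true dispersion by more when $\psi$ is larger, while the noise-adjusted version stays close to the true value of 10.6.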
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Overall, the calibrated Monte-Carlo experiment confirms that the IV-RC estimator with bias correction 
provides improved estimation and inference for all the parameters of interest for the model considered in the 
empirical example. 

Appendix B. Consistency of One-Step and Two-Step FE-GMM Estimator 
Lemma 3. Suppose that the Conditions^ and\^ hold. Then, for every rj > 



Pr < sup sup 
I l<i<n (e.a)er 



QY(d,a)-Qf{6,a) > r?j = o^ 1 ), 
sup \Qf{9,a) - Qf{e',a)\ < C ■ E[M(z it )] 2 \0-9'\ 

a 

for some constant C > 0. 
Proof. First, note that 

Q?(9,a)-QF(0,a)\ < \ 9i {9,a)'W^ 9i {e,a) - 9i {9,a)'Wr x 9i {9,a)\ + \ 9i {9,a)'{Wr 1 -Wr l ) 9i {9,a) 

< | a) - 9i (9, a)]'Wr%(9, a) - 9i (9, a)]\+2- \ 9l {0, a)'^" 1 [ 9i (9, a) - 9i (9, a)} | 
+ \[g i (9,a)-g i (6,a)} , (Wr 1 - W^)W, a) - 9i (9, a)} | + 2 | [ 9i {9, a) - 9l {0, a)]'^ 1 - Wr l ) 9i {9,a) 
+ gifraYOVr 1 -Wr^g^a) <dl max \§ k>i (9, a) - g k<i {6 : a)\ 2 \ W^ 1 

l<k<d g 

+ 2d 2 sup E[M(z it )] \Wi\~ 1 max \ 9 k,i(9,a) - g k ,i(9,a)\ + o P [ max \g k i (9,a)-gk,i(9,a)\), 

l<i<n l<k<d g \l<k<d g J 

where we use that $\sup_{1 \le i \le n} |\hat W_i - W_i| = o_P(1)$. Then, by Condition [2] we can apply Lemma 4 of HK to 
$|\hat g_{k,i}(\theta, \alpha) - g_{k,i}(\theta, \alpha)|$ to obtain the first part. 

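The second part follows from 

$$|Q_i^W(\theta, \alpha) - Q_i^W(\theta', \alpha)| \le |g_i(\theta, \alpha)' W_i^{-1} [g_i(\theta, \alpha) - g_i(\theta', \alpha)]| + |[g_i(\theta, \alpha) - g_i(\theta', \alpha)]' W_i^{-1} g_i(\theta', \alpha)|$$

$$\le 2 \cdot d_g^2 E[M(z_{it})]^2 |W_i|^{-1} |\theta - \theta'|.$$

□ 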



B.1. Proof of Theorem 1. 



Proof. Part I: Consistency of 0. For any 77 > 0, let e := Mi[QY(9o, atia)-8VV{ie, a )-.\(6, a )-(p ,a i0 )\>r,} Qf(0,a)} > 
as defined in Condition[2] Using the standard argument for consistency of extremum estimator, as in Newey 
and McFadden (1994), with probability 1 - o(T" 1 ) 



max n 

|0— 8o|>>7, Cti,..., ot n 



n n -. 



i=l i=l 



by definition of e and Lemma [3] Thus, by continuity of QY and the definition of the lefthand side above, 
we conclude that Pr 



> 



= T 



Part II: Consistency of on. By Part I and Lemma G2 
(B.l) Pr 



sup sup 


QY( 


9,aj-QY(9 ,a) 




_l<i<n a 









oiT- 1 ) 



4 



for any r\ > 0. Let 



Condition on the event 



e := inf 



)Y (00, Oi io ) - sup QY (0o , «t 

{a;:|ai-aiol>»i} 



> 0. 



sup sup 

Ki<n a 



QY(e,a) -QY(0o,a) 
which has a probability equal to 1 — o (T^ 1 ) by (|B.1[) . Then 



max ( (9, a 4 ) < max Qf (0 O , aj) + -^e < Qf (0o, a,o) - < 

|aj— Qiol>') ^ ' I";- c«iol>'? o o 



| ct-f — aio|>?? ° ^ ^ \<Xi — a i0 \>7] " 3 3 v J 3 

This is inconsistent with ctij > QY otujj , and therefore, \&i — ctio\ < r\ with probability 1 — o(T _1 ) 

for every i. 

Part III: Consistency of A;. First, note that 



A; 



_1 ( 

max 

Kk<d„ \ 



< d n 



max sup \gk,i(Q,oti) - gk,i(0,ai)\ 



Kk<d 



s (e,Qi)GT 



M{z it ) 



M(zit) \&i - a i0 \ 



Then, the result follows because sup 1<i<n \Wi — Wi\ = op(l) and {Wi : 1 < i < n} are positive definite 
by Conditional maxKK^ sup (e Qi)eX \gk,i(9, on) - g k .i{9,oti)\ = o P (l) by Lemma 4 in HK, and 9 - 9 = 
op(l) and sup 1<i<n \&i — o>io\ = op(l) by Parts I and II. □ 



B.2. Proof of Theorem 3. 

Proof. First, assume that Conditions [TJ [21 El and [S] hold. The proofs are exactly the same as that of Theorem 
[1] using the uniform convergence of the criterion function. 

To establish the uniform convergence of the criterion function as in Lemma [3l we need 



sup 

Ki<n 



fti(0,ai) - Slj(6»o,a i0 ) = o P (l), 



along with an extended version of the continuous mapping theorem for o u p. This can be shown by noting 
that 

?ii(e,a i )-vt i {e ,a i Q) < h l (e,a l )-n l (9,a l ) + 0,(0, a*) -n^^o) 



< 



hi(fi,ai)-Qi(0,ai) +d 2 g E[M(z tt ) 2 ] (0, on) - (6 , a i0 ) 



The convergence follows by the consistency of 9 and 6Vs, and the application of Lemma 2 of HK to 

g k (z it ;9,ai)gi(z. lt ;9,ai) using that \g k (z. lt ;9,ai)gi(z it ;9,ai)\ < M(z it ) 2 . □ 

Appendix C. Asymptotic Distribution of One-step and Two-step FE-GMM Estimator 



C.1. Some Lemmas. 

Lemma 4. Assume that Condition^ holds. Let h(zit]9,ai) be a function such that (i) h(zi t ;9,ai) is 
continuously differ entiable in (9,cti) € T C K 9+ (ii) T is convex; (Hi) there exists a function M (zn) such 
that \h{z u ;d,ai)\ < M(z it ) and \dh(z l f 1 9,a l )/d(9,a i )\ < M(z it ) with E [M{z lt f^+ d ^l^ 1 ^+ 5 ] < oo 
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for some S > and < v < 1/10. Define Hi(9, cti) := T 1 h{ z U\ 0, &i)> o,nd Hi(9, cti) := £ 

Lei 

a* = arg max QY (9*, a^), 

such that a* — c^o = o uP (T aa ) and 9* — 9q = op(T ae ), with —2/5 < a < 0, for a — max(a a ,ag). Then, for 
any 9 between 9* and 9q, and o>i between a* and ctio, 

Vf[Hi(6,ai) - Hi(S,ai)] = o u p(T 1/10 ), Hi(§,ai) - #i(0 O ,a iO ) = o uP (T a ). 



H % {9, ai ) . 



Proof. The first statement follows from Lemma 2 in HK. The second statement follows by the first statement 
and the conditions of the Lemma by a mean value expansion since 



Hi(9,ai) - Hi(9 ,a i0 ) 



< 



=o uP (T»)s_ 



\oti - Qtio 



=O uP (l) =O uP (l) 

fli (0 O , « iQ ) - fli (0 O , am) = o uP (T a ). 



=o uP (t-vs) 



□ 



Lemma 5. Assume that Conditions [7J [H and[7] hold. Let (9 , ji) denote the first stage GMM score of 
the fixed effects, that is 

Wia „ \ ( Ga t (9,a t )'X l \ 
y g l {9,a l ) + WiXi J 

where ji — (aj, A^)', s^(0,7i) denote the one-step GMM score for the common parameter, that is 

sf{9, lt ) = -G 0l {9,a l )'\ l , 

andji(0) be such that (9,^(6)) = 0. 

Let T^(0,7<) denote dtf{9, lr)/di t d lld and M^(9, 1% ) denote dsf (0,7i)/^7^7i, j , V some < j < 
+ where jij is the jth element of 7; and j = denotes no second derivative. Let iV l w (0,7i) denote 

dtY(9,-/i)/d9' and SY(0,7t) denote dsf (9,n)/d9' . Let (9,%...,%) be the one-step GMM estimator. 

Then, for any 9 between 9 and 9q, and j i between 7$ and jio, 

f%Q>li)-T% = o uP (1) , M%(6,%)-M% = o uP (1) , N^^-Nf = o uP (1) , ^(S^-Sf = o uP (1) . 
Also, for any j i0 between 7.^ and 7^0 = 7^(^0)7 

VTtf (^o, 7,o) = (T 1/W ) , VT (f^o, 7 i0 ) - T%) = o uP (TV") , 

^(M^(0o,7 iO )-M^)= O „p (T 1 /!"), 



Proof. The first set of results follows by inspection of the scores and their derivatives (the expressions are 
given in Appendix [G]), uniform consistency of $\tilde\gamma_i$ by Theorem [1] and application of the first part of 
Lemma [4] to $\theta^* = \tilde\theta$ and $\alpha_i^* = \tilde\alpha_i$ with $a = 0$. 



The following steps are used to prove the second set of results. By Lemma 31 

VftT = o uP (rvio) ; fr(9 Q ,iJ -T? = o uP (1) 
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where j i0 is between 7^ and 7^0- Then, a mean value expansion of the FOC of 7,0, <]^(^Oi7io) — 0, around 
7io = 7io gives 

=O tt (l) =o„p(TVio) = „(i) ' ' 
= o u p(T 1 ^ 10 ) + o uP (VT(7,o - Tio)) , 
by Condition [3] and the previous result. Therefore, 

(1 + o uP (l)) VT(^o - 7i0 ) = OupiT 1 / 10 ) => VT(% - 7i0 ) = o uP (T 1 / 10 ). 



Given this uniform rate for $\tilde\gamma_{i0}$, the desired result can be obtained by applying the second part of 
Lemma [4] to $\theta^* = \theta_0$ and $\alpha_i^* = \tilde\alpha_{i0}$ with $a = -2/5$. □ 

C.2. Proof of Theorem 2. 

Proof. By a mean value expansion of the FOC for 9 around 9 = 9$, 

where 9 lies between 9 and 9q. 

Part I: Asymptotic limit of ds w (9) / d9' . Note that 

dS w (9) = 1 A dsf(9M0)) 
d9' nf-{ d9' 

(C1] dsf(9,Ud)) = dsW(9,U0)) dsf (9,^(9)) dW) 

1 ■ ' d0' 39' <9 7l ' 9' ■ 

By Lemma El 

d^jOMO)) _ q w , n m dsY(9M0)) _ M w 1 n m 
— -6j +o„ P (l), — -M l +o uP (l). 

Then, differentiation of the FOC for 7i(0), (9,ji(9)) = 0, with respect to 9 and 7, gives 

By repeated application of Lemma [3] and Condition [31 

gg/ = ~ l T i J ^ +o„p(1). 

Finally, replacing the expressions for the components in (|C.1|) and using the formulae for the derivatives, 
which are provided in the Appendix [Gj 

(C2) ^1 = \ £ G& ( P£<3* + = + op(1), Jf = ^P^]. 

i— 1 

Part II: Asymptotic Expansion for 9 — 9q. By (|C.2[) and Lemma |2"21 which states the stochastic expansion 
of VrtTs w (0 o ), 

= V^s w (9 ) + ^r 1 V^T(9-9 Q ). 
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Therefore, \J nT(9 — 9q) = Op(l), and by part I, Lemma 12121 and Condition[3J 

VrtT0 - 6 ) 4 -(jf r 1 a (kBY, V s w ) . 

□ 

C.3. Proof of Theorem 4. 

Proof. Applying Lemma [5] with a minor modification, along with Condition we can prove an exact coun- 
terpart to Lemma [5] for the two-step GMM score for the fixed effects 

where the expressions of and tf are given in the Appendix [Gj and for the two-step score of the common 
parameter 

The only difference arises due to the term if (#,7^), which involves $7,(6*, a^) — fli. Lemma [S] shows that 
VT(hi(9, cti) — — o u p(T 1 / 10 ), so that a result similar to Lemma [5] holds for the two-step scores. 

Thus, we can make the same argument as in the proof of Theorem 2 using the stochastic expansion of 
V hT's(8q) given in Lemma [23] □ 



Appendix D. Asymptotic Distribution of Bias-Corrected Two-Step GMM Estimator 
D.1. Some Lemmas. 

Lemma 6. Assume that Conditions [7] [H [3J [7] and [5] hold. Let £i(#,7i) denote the two-step GMM score for 
the fixed effects, Si(#, 7$) denote the two-step GMM score for the common parameter, and "fi(9) be such that 
U(0,"fi(9)) = 0. Let Ti t j(9,^ii) denote dti(9, / dj^djij , for some < j < d g + d a , where 7^ is the jth 
component of 7$ and j — denotes no second derivative. Let Ni(9,ji) denote dti(9,"fi)/d9' . Let Mij(9,^/i) 
denote d'§i(9,^i)/d"{ l i d^i^ , for some < j < d g + d a . Let Si(9,"fi) denote 8^(9 , 7,) / 89' . Let (0,{7i}™=i) be 
the two-step GMM estimators. 

Then, for any 9 between 9 and 9q, and 7 y i between 7^ and "fto, 

\/r(f M (3,7,)-T M ) = o uP (r^ w y y/T (Mijfon) - MtJ) = o uP (t 1 / 10 ) , 

VT (NiQrfi) - Ni) = o uP (r 1 / 10 ) , Vf (£#,7*) - Si) - o uP (TV") . 

Proof. Let % = %(9) and %o = 7i($o)- First, note that 

Vf(% - 7io) = ^jp-Vf(0- *<>) = -(Tpy 1 NiVf(9- 9a) + o uP (Vf(9- 8 )) = O^- 1 / 2 ). 

=O u (l) =Op(n-V=) 

where the second equality follows from the proofs of Theorems 2 and 4. Thus, by the same argument used in 
the proof of Lemma [S] 

Vf(ji - 7l0 ) = VT{% - 7,0) + VT(%o - 7io) = o uP {T l ' w ). 

Given this result and inspection of the scores and their derivatives (see Appendix [G]), the proof is similar 
to the proof of the second part of Lemma [5]. □ 
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Lemma 7. Assume that Condition]!] holds. Let hj(zit;9,ai), j — 1,2 be two functions such that (i) 
hj(zit;9,cti) is continuously differentiable in (9,a>i) € T C M. de+da ; (ii) T is convex; (Hi) there exists a func- 
tion M(z it ) such that |ftj(.Sii;0,ai)| < M{z lt ) and\dh j {z u ;9,a i )/d(9,a i )\ < M(z it ) withE[M(z it ) w{ - de+da+& ' },< '- 1 ~ Wv)+s ] < 
oo for some S > and < v < 1/10. Define Fi(9,cti) := T~ 1 ^2 t=l h\(zit]9,ai)h2{zit]9,oii), and 
Fi(9,oti) :=^[Fi(0,ai)]. Let 

a* = argsupQf (0*,a), 

a 

smc/i t/iat a* — a^o = o u p(T a °) and 9* — 9q = op(T ae ), with —2/5 < a < 0, /or a = max(o a ,( 
any 9 between 9* and 9q, and tti between a* and aio, 

Fifrcii) - F(0 o ,a io ) = o uP (T a ), VTiF^^i) - Fifrcii)] = o uP (T 1/w )- 



Then, for 



Proof. Same as for Lemma [4], replacing $H_i$ by $F_i$, and $M(z_{it})$ by $M(z_{it})^2$. 



□ 

Lemma 8. Assume that Conditions]]]]^ 00 <™d[H/io/d. LetQi(9,a i ) = T^ 1 Y^ = i9( z it',9,ai)g(z it ;9,a i y 
be an estimator of the covariance function fl; = E[g(zit)g(zit)'], where 9 = 9q + op(T~ 2 / 5 ) and tti = 
a lQ + o uP {T- 2 / 5 ). Let ^^(0,5*) = 0^+^(9 : ,ai) / 'd dl c^d* 9 \ /or < di + d 2 < 2. Then, 



Proof. Note that 

\g(z it ; 9,ai)g(z it ;9, a,-)' - F [g^a; 0, ai)g(za; 9, ok)'] | 

< d 2 max \g k (z it \9,a i )gi(zit]'9,a l )' - E [g k (zit;9,ai)gi(zit;9,ai)'] I . 

a l<k<l<d g 

Then we can apply Lemma [7] to hi = g k and h 2 = gi with a = —2/5. A similar argument applies to the 
derivatives, since they are sums of products of elements that satisfy the assumption of Lemma [7] □ 

Lemma 9. Assume that Conditions @ and [6] hold, and £ — > oo such that £/T — > as 



Gati (9) Or 1 ^ (5) J , F a , (0) - 
G ai (fl/wr 1 ^ (9)]~\h%($) = 



T — > oo. For any suc/i that between 9 and 9q. let E ai (9) = 
£ Qi (9) G ai (9)'nr\ P ai (9) = Qr 1 - Hr 1 ^ (0) H at (9) , (0) = 
%Z (9) G ai (9)'W~\ J sl (9) = G 6i (9)'P ai (9) G 6i (9) , B°(9) = T^ E- =0 T,t j+ i Ge u {9)'P«m9i,t-M 
and B%(6) = -GeM'i^ii 9 ) + ^W) + ( 9 ) + ^fS 6 ^' where 

d a IT 

-h t (9) G<*«i,i ( )/ 2 + ( ) 12 T ^ 12 ( )> 



oc 



i=o t=i+i 



^ G ait (9)'P*M9i,t-j(9), 

j=0 t=i+l 
£ T 

^W^Tt- 1 ^ gu^guWyPoMdij-iie), 

j=0 t=j+l 

P a M^KAHX^)-H ai ,{9)l 

3=1 
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be estimators o/£ Qi , H ai ,P ai , H^f . J s i, and B^. Let F a d ig d 2i (9,ai(9)) and F a d lS d 2i (9,ai), with 
F e {E,H,P, E W ,H W , J sl ,B^,B^} denote their derivatives for < d x + d 2 < 1. Then, 

VT (F a d lf) d. 2l (5,Oi(5)) - i^ ied2l ) - o MP (t 1 / 10 ) . 
where F a d lS d 2i :— F if d± + c?2 = 0. 

Proof. The results follow by Theorem [3] and Lemma [BJ using the algebraic properties of the o u p orders and 
Lemma 12 of HK to show the properties of the estimators of the spectral expectations. □ 

Lemma 10. Assume that Conditions [7J [H [3J [7J [3J and\Qhold. Then, for any 9 between 9 and 6q, 

$$\hat J_s(\bar\theta) = J_s + o_P(T^{-2/5}).$$

Proof. Note that 

VT \G ei @)'P ai (6)G ei (6) - G' 0i P ai G e ] = o uP (T^ w ), 

by Theorem 13] and Lemmas [5] and [HI using the algebraic properties of the o u p orders. The result then follows 
by a CLT for independent sequences since 

n 

J s (9) - J s = E[Ge 1 (9) l P a% (9)Ge i (§)] - E[G' ei P ai G di ] = nT 1 £ (G'g^Ge, - E[G' e% P^G e S) + o uP (T~ 2 ^). 



i=i 



□ 



Lemma 11. Assume that Conditions [7J [H [3J [7J [3J and\Qhold. Then, for any 9 between 9 and 9q, 

$$\hat B_s(\bar\theta) = B_s + o_P(T^{-2/5}).$$

Proof. Analogous to the proof of Lemma [10], replacing $J_s$ by $B_s$. 



□ 



Lemma 12. Assume that Conditions]]^ @ @ and [6] hold. Then, for any 9 between 9 and 9q, and 
$\mathcal{B} = -J_s^{-1} B_s$, 

$$\hat{\mathcal{B}}(\bar\theta) = -\hat J_s(\bar\theta)^{-1} \hat B_s(\bar\theta) = \mathcal{B} + o_P(T^{-2/5}).$$



Proof. The result follows from Lemmas [10] and [11] using a Taylor expansion argument. 
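
One way to spell out this step is through an exact algebraic decomposition rather than an explicit Taylor expansion. The display below is a sketch under the assumptions that $J_s$ is nonsingular, $B_s$ is bounded, and $\hat J_s(\bar\theta)^{-1} = O_P(1)$; the notation $\hat J_s$, $\hat B_s$ for the estimators of $J_s$, $B_s$ is used here for illustration.

```latex
\begin{aligned}
\hat J_s(\bar\theta)^{-1}\hat B_s(\bar\theta) - J_s^{-1}B_s
 &= \hat J_s(\bar\theta)^{-1}\bigl[\hat B_s(\bar\theta) - B_s\bigr]
  - \hat J_s(\bar\theta)^{-1}\bigl[\hat J_s(\bar\theta) - J_s\bigr] J_s^{-1} B_s \\
 &= O_P(1)\, o_P(T^{-2/5}) + O_P(1)\, o_P(T^{-2/5})\, O(1)
  = o_P(T^{-2/5}).
\end{aligned}
```

The first line uses $\hat J^{-1} - J^{-1} = -\hat J^{-1}(\hat J - J)J^{-1}$, and the second line applies the rates for the estimators of $J_s$ and $B_s$. Multiplying through by $-1$ gives the stated rate for the estimated bias term.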

D.2. Proof of Theorem 5. 

Proof. Case I: C = BC. By Lemmas [TD] and US 



□ 



'nT 



= -J s (9) s(9 Q )=-J- 1 s(9 ) + op(T- 2 / 5 )O 



-J- 1 s(e ) + o P (l). 



Then, by Lemmas [T2l and [231 



'nT 



]BC 



T 



-9 ) = V^T (e- 9 ) - VnT-% (0) = -J^s(9 ) + «/ = J^fl, + o P (l) 



= -j: 



B s - \l —B s 



+ 0p (i)4tv(o,j s - 1 ). 



Case II: C = SBC. First, note that since the correction of the score is of order Op(T x ), 9 SBC 
Op(T _1 ). Then, by a Taylor expansion of the corrected FOC around 9 SBC = 9q 



= s 



(9 SBC ) 



T- X B S 



jSBC 



a\ laSBC 



s(6 ) + J s {0) (9 



%)-T- 1 B s +o P (T- 
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where 9 lies between Q SBC and 0q. Then by Lemma [ 



/ nT (o SBC — 9q^J = -J s (9) 
= -J s {0) 



^nTs^o) - n 1/2 T- 1/2 B s ] + o P (l) 



1 \ ~ Fn l~n r, 



+ op(l)4 7V(0,J7 1 ). 



Case III: C = IBC. A similar argument applies to the estimating equation (5.2), since $\hat\theta^{IBC}$ is in an 
$O(T^{-1})$ neighborhood of $\theta_0$. □ 

Appendix E. Stochastic Expansion for $\tilde\gamma_{i0} = \tilde\gamma_i(\theta_0)$ and $\hat\gamma_{i0} = \hat\gamma_i(\theta_0)$ 

We characterize the stochastic expansions up to second order for one-step and two-step estimators of the 
individual effects given the true common parameter. We only provide detailed proofs of the results for the 
two-step estimator $\hat\gamma_{i0}$, because the proofs for the one-step estimator $\tilde\gamma_{i0}$ follow by similar 
arguments. Lemmas 1 and 2 in the main text are corollaries of these expansions. The expressions for the 
scores and their derivatives in the components of the expansions are given in Appendix [G]. 

Lemma 13. Suppose that Conditions{^\^\^[ and^ hold. Then 

Vt(j m - 7,0) = i>V + t-^rZ A N(Q, V™), 



whe 



l t=i 



Also 



(Try 1 vfty=o uP (T i/i % RY i =o uF (T i/ % vr=E[^^'\ 

■4E^ = Op(1). 



Proof. We just show the part of the remainder term because the rest of the proof is similar to the proof of 
Lemma [TBI By the proof of Lemma [SJ VT^io — 7,0) = o u p(T 1 / 10 ) and 

Ifi = -(Tf V (fr(0o, 7,o) - T^ y/Tfa-yo) = o uP {T^). 

=o„_p(T 1 /io) 



=O u (l) 



=o„p(TV10) 



□ 



Lemma 14. Suppose that Conditions [7J fJl [3J and^hold. Then, 

VT(j z0 - 7i0 ) = W + T-V*Q% + T-iR%, 



whe 



AT = 



d g +d a 



Also, 



3=1 

T( r u -T^) = o uP (T 1 / 1 % R%=o uP (TV 10 ) 
l -±QZ = P {l). 



= o u p(T^ 5 ), 
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Proof. Similar to the proof of Lemma [T5] 

Lemma 15. Suppose that Conditions [7J [Jl [3J and^hold. Then, 

n 1 n 

-L £ ^ 4 tf(0, i £ QJT A + <' G + B 



□ 



_, pW 



where 
V w 



B Z J 



HZ 

PS , 



\ OCi ' Oli J 



HZ 

pW 



D 



W,G 



J2 E[G a Mt)HZg(^)] - Y^g^hZ^hZ' /2 

J2 EiGM'P^gizi,^)], 

j=—oo 

f^G' aai j PZn t HZ'/2 + J2 ® ^)if^P^ ./2 

o 

]T s M^PZai^t-j)] , 

j=—oo 

for = {G' at Wr 1 G ai )~ X , ff£ = Z^G'^Wr 1 , and P™ = W~ x - W^G^H™. 



D, 



w,is 
n 



BZ° 

pW,G 
pW,lS 

TjW.lS 



h£ t 

-HZ' 
' HZ X v 

, p5 



Proof. The results follow from Lemmas [121 and 1141 noting that 



, / _yW ttW 

(tZ) = - ' 



W pW 



H w p 



<l>. 



w 



HZ) f , 



E 



E 



7W 7W' 



OLi 

pW 



Q. ( ff W P w , 



E 



E[G Ql (z tt yPZ9(^)] 
E [G^tYHZg^t-j)] + E [^(z lt )P^g( Zht ^)] 



_ ( g^ pZ^hZ' 



E 



r.Wrpwjw 

> i,j Yi 



g'hz^hz: r 



if j < d a ; 



G' aai (I da ® e,-_ d a )H%SUP% tj , if j > d Q 



□ 



Lemma 16. Suppose that Conditions [7J [H [7J El and&hold. Then, 

Vf(% - 7j0 ) = & + T _1 / 2 i?ij 4 AT(0, V5), 

w/iere 



-Lf;^ = -(3i n )" 1 ^ , = o«p(r 1/10 ), Ru = o uP (T^ 5 ), Vi = E$M' 
^ t=i 

1 " 

v i=l 
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Proof. The statements about ipi follow by the proof of Lemma [3] applied to the second stage, and the CLT 
in Lemma 3 of HK. From a similar argument to the proof of Lemma [3 

Ru = -(2f r 1 Vf(fp(e ,^) - if) VT(%o - 7,o) - (iff 1 VTif^eo^) ~ if) VT(% - 7*0) 

=O u (l) =o„j,(Ti/iO) =OuP ( T i/i0) =0„(1) =o„p(TV10) =o„ j ,(T 1 /io) 



by Conditions [3] and |U 

Lemma 17. Assume that Conditions [7J [H [7] and[S[ hold. Then, 
where 



□ 



3 = 1 



/5\ 



and ip^j is the jth element of ip^. 

Proof. By a mean value expansion around (#o,a;o): 



3 = 1 



3 = 1 



where (9, c?i) lies between (9, &i) and (9q, octo). The expressions for ip^ can be obtained using the expansions 
for 7^0 in Lemma [T3l since ji — 7^0 = o u p(T~ 3 / 10 ). The order of this term follows from Lemma [T3l and the 
CLT for independent sequences. The remainder term is 

d a dg 

RZi = E ( -'.. R uj + (§&) - Q a ,jVT(a id - aioj)] + £ n 9j (9,^(9, - 9 0tj ). 

3=1 3=1 
The uniform rate of convergence then follows by Lemmas [S] and [T31 and Theorem [TJ 



□ 



Lemma 18. Suppose that Conditions [7J [H [3J [7J and[3|/io/rf. Then, 



(E.l) 
where 



T(j i0 - 7*0) = ^ + T- 1/2 Qi s : + T^ifcj 



Qu(ipi, ai) 
A? 



(if) 



J' =1 

T(Tf - TP) = o uP (t 1 ' 10 ) , ^ = „ P ( T 3 / 10 ) . 
1 " 

-Vgi,=Op(i). 



o«p (t 1 



/5 



Proof. By a second order Taylor expansion of the FOC for 7^0 , we have 

_^ d g +d a 

= ?i(0o,7io) = *f + ^(7,0 - 7 l0 ) + ^ H (7io,j ~ Ho,j) T iA e o,li){%o ~ 7io), 
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where 7 y i is between 7^0 and 7^0- The expression for Qu can be obtained in a similar fashion as in Lemma 
A4 in Newey and Smith (2004). The rest of the properties for Qu follow by Lemma [5] applied to the second 
stage, Lemma [TBI and an argument similar to the proof of Theorem 1 in HK that uses Corollary A. 2 of Hall 
and Heyde (1980, p. 278) and Lemma 1 of Andrews (1991). The remainder term is 



R21 — - (Tj 



Ig+do, 

2=1 



7,0,^(1^.(00,7,) - T£ )VT(% - 7l0 )/2 



- (If)" 1 £ ^0,: 

2=1 

- (if) _1 [dza 5 [0, iJ^]v^(7i0 " 7«) + diag[0, $£]P W 
The uniform rate of convergence then follows by Lemmas [5] and [TBJ and Conditions [3] and 0J 
Lemma 19. Suppose that Conditions [3 [H [7J [3J and r/j| Then, 

n 1 n 



i=i 



where 



V, 



Bi 



B" = 



diag(E ai ,P ai ) , 

(*)■ 



B: 



B 



w 



B ai 
-So 

H, 



/2 + S[G Ql (z It )^a,3(^ 



2 = 1 



£) J E[G Qi (^ t )'P ajff ( 



^i,t-j) 



2=0 



\^2 l E\g{z it )g{zu)'P ai g{zi,t-j)], 



H, 

P, 



2=1 



/or E Qi = (G' fi~ GcJ , B ai = S Q4 G' flr 1 , and P Qi = ft" - 9,^G ai B ai . 



□ 



Proof. The results follow by Lemmas [TH] and [TH1 noting that 



V < p « 
s Qj \ 
p q .. / 



■0** = - 



H a 



E 



= E 

2=0 



S[G Qi (z it )'P ai9 (^,t_ i )] 
£;[G Q ,(z lt )'^a,5(^,t-,)] 



E 



E 



diag[0,<JV 



0. 



if j < d Q ; 
if j > do,. 







□ 
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Appendix F. Stochastic Expansion for (0 o ,7o) and si(0O)7io) 

We characterize stochastic expansions up to second order for one-step and two-step profile scores of the 
common parameter evaluated at the true value of the common parameter. The expressions for the scores 
and their derivatives in the components of the expansions are given in Appendix [G] 

Lemma 20. Suppose that Conditions [3 [H [31 and^hold. Then, 

SWra z, \ — T-XllXW , rp-i n w , ti-3/2 t>W 

where 

d g +d a 

Cf = VT(Mr-Mr) = o uP (T 1/W ), R? sl = o uP (T 2/5 )- 

Also, 

1 n n 
* i=l i=l 



Proof. By a second order Taylor expansion of s^{6q, 70) around 70 — jio, 

d s +d a 

Sf (Oo, 7o) = Sf + (7 M - 7 io) + 2 E " 7^)^(00, T^o ~ 7io), 

j'=i 

where 7 is between 70 and 7,0 • Noting that s| 1/ (#o?7io) = and using the expansion for 70 in Lemma 
[14l we can obtain the expressions for ifj^ and Qx a i, after some algebra. The rest of the properties for these 
terms follow by the properties of ip^ and ■ The remainder term is 

RZ = MfRZ + crnZ + \ ^ [rZ 3 m%Vt{^ - 70) + $Z M % R u 

d g +d a 

The uniform order of R^si follows by the properties of the components in the expansion of 70, Lemma [SJ 
and Conditions [3] and |4j □ 



Lemma 21. Suppose that Conditions [2 [?1 and^hold. We then have 
1 " 

»=i 
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Proof. The results follow by Lemmas and [131 noting that 



E 



E 



i, w i, w ' 

i SI t 81 



M 



w 



= £ ElGoM'PZgi^t-i)} 



E 



/-it pWn.if 



^o.h-W ifj<d a ; 



□ 



Lemma 22. Suppose that Conditions\]\\j^\^ and\4\hold. Then, forl3 W (9o) = n 1 (#0j 7io)j 

where By and are defined in Lemma [2l[ 
Proof. By Lemma 1201 

1 ™ 




v i=l v i=l 



ir 

2.s< 



1 ™ 



=O p (1) 

"1 

Tn 



— 1 " 



=o P (l) 
P (1). 



□ 



Then, the result follows by Lemma I2T1 

Lemma 23. Suppose that Conditions [3 [3 [7J El andE|/io/d. Then, 

s;(0o,7;o) = T- 1 / 2 ^ + T- x Q lsi + T- s / 2 R 2si , 

where all the terms are identical to those of Lemma [20] after replacing $W$ by $\Omega$. Also, the properties of 
all the terms of the expansion are analogous to those of Lemma [20]. 



Proof. The proof is similar to the proof of Lemma [20]. 

Lemma 24. Suppose that Conditions Ql El afid[6|/io/d. Then, 

1 n 

-7=$^" ^ N(0,J s ), J s = E[G' ei P ai G 9i ] 



□ 



1 n 

77 ' 



i = l 



= So.G^nr 1 , and S Qj = (G^fl^G^y 1 . 
Proof. The results follow by Lemmas [THl [TSJ [T5] and [531 noting that 



Pa, — 1— 1 G ai H ai , 



E 



= M? I Sqi ° ) Aff , £ 
I P m / 



J=0 



= 0. 



□ 
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Lemma 25. Suppose that Conditions [7J [Jl [3J [5l and^hold. Then, or s(#o) = n 1 X)"=i ^i(9o, 7io), 

v^Ts(#o) = -t= V + A& s + o P (l) A TV ( K B S , J s ) , 
where ip 8 i and B s are defined in Lemmas \23\ and \24\ respectively. 

Proof. Using the expansion form obtained in Lemma [2"3I we can get the result by examining each term with 
Lemma [24] □ 

Appendix G. Scores and Derivatives 

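G.1. One-Step Score and Derivatives: Individual Effects. We denote the dimensions of $g(z_{it})$, $\alpha_i$, and 
$\theta$ by $d_g$, $d_\alpha$ and $d_\theta$. The symbol $\otimes$ denotes the Kronecker product of matrices, and 
$I_{d_\alpha}$ denotes a $d_\alpha$-order identity matrix. Let 
$G_{\alpha\alpha_i}(z_{it}; \theta, \alpha_i) := (G_{\alpha\alpha_{i,1}}(z_{it}; \theta, \alpha_i)', \ldots, G_{\alpha\alpha_{i,d_\alpha}}(z_{it}; \theta, \alpha_i)')'$, where 

$$G_{\alpha\alpha_{i,j}}(z_{it}; \theta, \alpha_i) = \frac{\partial G_{\alpha_i}(z_{it}; \theta, \alpha_i)}{\partial \alpha_{i,j}}.$$

We denote derivatives of $G_{\alpha\alpha_i}(z_{it}; \theta, \alpha_i)$ with respect to $\alpha_{i,j}$ by 
$G_{\alpha\alpha,\alpha_{i,j}}(z_{it}; \theta, \alpha_i)$, and use additional subscripts for higher order derivatives. 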
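
For readers less familiar with the Kronecker notation, the following sketch builds the block structure of an identity matrix Kronecker-multiplied by a column vector, the kind of term that appears in the derivative formulas of this appendix. The function names `kron` and `identity`, the dimensions, and the numerical vector are hypothetical choices for illustration only.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def identity(d):
    """d-order identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]


# Hypothetical sizes: an identity of order 2 and a 3 x 1 column vector.
lam = [[1.0], [2.0], [3.0]]
block = kron(identity(2), lam)  # 6 x 2 matrix with lam repeated down the block diagonal
```

Each diagonal block of the result is a copy of the vector, and all off-diagonal blocks are zero.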



G.1.1. Score. 



a. , 

T \ g(z it ;9,ai) + W l X l 



G ai (6,ai)% 
g t (9,a t ) + WiXi 



G.1.2. Derivatives with respect to the fixed effects. 
First Derivatives 



fy(e,ji) 



T 



w 



a*T(7i,0) _ ( Gaa,{9,ai)'(I da ®\i) Ga^Oi)' 

H \ Ga t (9,a t ) Wi 

_( G' a A 

V G ai W t J ■ 



E 



T 



ttW' pW 

Oil Oti 



Second Derivatives 



= < 



rp\' 



E 



T%(7io;9 ) 



G aa ,a i:j (9, Cti)'(Id a ® Ai) Gaai j {9, <*»)' 

G aaiJ (6,ai) 

G aai (6,ai)'(Iii a ® e j-<0 \ 

J ' 



GUj \ 5 

Gaaij y 

<W J *«. ® e i-<0 




if j < d a ; 
, iij>d a . 



if j < d a ; 
if j > d a . 
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Third Derivatives 



d-fi^d-fijd-fl 



= E 



i,jk 



= < 



(8, ai)'(I da <g> Ai) G aaoii jk (8, cxi)' 

Gaaa iJk (d,Cti) 

G aa , aiJ (6, ai)'(I da <g> e fe _ d J 

(8, cti)' (Id a <£> ej-d a ) 










o g: 



G aaaijk 









G' aa , aitk (Id a ®ej- da ) 



o o\ 

0' 





if j <d a ,k< d a : 
if j <d a ,k> d a ; 
if j > d a ,k< d a ; 
if j > d a , k > d a . 



if j < d a ,k < d a ; 
if j <d a ,k> d a ; 
if j > d a ,k< d a : 
if j > d a ,k > d a . 



G.1.3. Derivatives with respect to the common parameter. 
First Derivatives 



JW7i) = 



88, 



N Z = E 



N, 



w 



Ge jai {9, Oii)'\i 
Ge itj (8,ai) 



G 



G.2. One-Step Score and Derivatives: Common Parameters. Let G$ ai (zit;0,ai) :- 
{G eaiA {zit;0, cti)', G datda (z it ;8, a,)')', where 

dGe{zu--,0-,ai) 



Ge ai Jz it ;9,ai) = 



dctij 



We denote the derivatives of G$ ai (zit;0,ai) with respect to ctij by Ge a ,a id {zit]8,ai), and use additional 
subscripts for higher order derivatives. 



G.2.1. Score. 

1 T 

3T(0.7i) = ~f 0,0,)% = -Ge^aiYXi. 

t=i 



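G.2.2. Derivatives with respect to the fixed effects.
First Derivatives

$$\hat M_i^W(\theta,\gamma_i) = \frac{\partial \hat s_i^W(\theta,\gamma_i)}{\partial\gamma_i'} = -\begin{pmatrix} \hat G_{\theta\alpha_i}(\theta,\alpha_i)'(I_{d_\alpha}\otimes\lambda_i) & \hat G_{\theta_i}(\theta,\alpha_i)' \end{pmatrix}, \qquad M_i^W = E\left[\hat M_i^W(\theta_0,\gamma_{i0})\right] = -\begin{pmatrix} 0 & G_{\theta_i}' \end{pmatrix}.$$

Second Derivatives

$$\hat M^W_{i,j}(\theta,\gamma_i) = \frac{\partial^2 \hat s_i^W(\theta,\gamma_i)}{\partial\gamma_{i,j}\,\partial\gamma_i'} = \begin{cases} -\begin{pmatrix} \hat G_{\theta\alpha,\alpha_{i,j}}(\theta,\alpha_i)'(I_{d_\alpha}\otimes\lambda_i) & \hat G_{\theta\alpha_{i,j}}(\theta,\alpha_i)' \end{pmatrix}, & \text{if } j \le d_\alpha; \\[1ex] -\begin{pmatrix} \hat G_{\theta\alpha_i}(\theta,\alpha_i)'(I_{d_\alpha}\otimes e_{j-d_\alpha}) & 0 \end{pmatrix}, & \text{if } j > d_\alpha. \end{cases}$$

$$M^W_{i,j} = E\left[\hat M^W_{i,j}(\theta_0,\gamma_{i0})\right] = \begin{cases} -\begin{pmatrix} 0 & G_{\theta\alpha_{i,j}}' \end{pmatrix}, & \text{if } j \le d_\alpha; \\[1ex] -\begin{pmatrix} G_{\theta\alpha_i}'(I_{d_\alpha}\otimes e_{j-d_\alpha}) & 0 \end{pmatrix}, & \text{if } j > d_\alpha. \end{cases}$$

Third Derivatives

$$\hat M^W_{i,jk}(\theta,\gamma_i) = \frac{\partial^3 \hat s_i^W(\theta,\gamma_i)}{\partial\gamma_{i,k}\,\partial\gamma_{i,j}\,\partial\gamma_i'} = \begin{cases} -\begin{pmatrix} \hat G_{\theta\alpha,\alpha_{i,jk}}(\theta,\alpha_i)'(I_{d_\alpha}\otimes\lambda_i) & \hat G_{\theta\alpha_{i,jk}}(\theta,\alpha_i)' \end{pmatrix}, & \text{if } j \le d_\alpha,\ k \le d_\alpha; \\[1ex] -\begin{pmatrix} \hat G_{\theta\alpha,\alpha_{i,j}}(\theta,\alpha_i)'(I_{d_\alpha}\otimes e_{k-d_\alpha}) & 0 \end{pmatrix}, & \text{if } j \le d_\alpha,\ k > d_\alpha; \\[1ex] -\begin{pmatrix} \hat G_{\theta\alpha,\alpha_{i,k}}(\theta,\alpha_i)'(I_{d_\alpha}\otimes e_{j-d_\alpha}) & 0 \end{pmatrix}, & \text{if } j > d_\alpha,\ k \le d_\alpha; \\[1ex] 0, & \text{if } j > d_\alpha,\ k > d_\alpha. \end{cases}$$

$$M^W_{i,jk} = E\left[\hat M^W_{i,jk}(\theta_0,\gamma_{i0})\right] = \begin{cases} -\begin{pmatrix} 0 & G_{\theta\alpha_{i,jk}}' \end{pmatrix}, & \text{if } j \le d_\alpha,\ k \le d_\alpha; \\[1ex] -\begin{pmatrix} G_{\theta\alpha,\alpha_{i,j}}'(I_{d_\alpha}\otimes e_{k-d_\alpha}) & 0 \end{pmatrix}, & \text{if } j \le d_\alpha,\ k > d_\alpha; \\[1ex] -\begin{pmatrix} G_{\theta\alpha,\alpha_{i,k}}'(I_{d_\alpha}\otimes e_{j-d_\alpha}) & 0 \end{pmatrix}, & \text{if } j > d_\alpha,\ k \le d_\alpha; \\[1ex] 0, & \text{if } j > d_\alpha,\ k > d_\alpha. \end{cases}$$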



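G.2.3. Derivatives with respect to the common parameters.
First Derivatives

$$\hat S_i^W(\theta,\gamma_i) = \frac{\partial \hat s_i^W(\theta,\gamma_i)}{\partial\theta'} = -\hat G_{\theta\theta_i}(\theta,\alpha_i)'\lambda_i, \qquad S_i^W = E\left[\hat S_i^W(\theta_0,\gamma_{i0})\right] = 0.$$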



G.3. Two-Step Score and Derivatives: Fixed Effects. 



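G.3.1. Score.

$$\hat t_i(\theta,\gamma_i) = -\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix} G_{\alpha}(z_{it};\theta,\alpha_i)'\lambda_i \\ g(z_{it};\theta,\alpha_i) + \hat\Omega_i\lambda_i \end{pmatrix} = -\begin{pmatrix} \hat G_{\alpha_i}(\theta,\alpha_i)'\lambda_i \\ \hat g_i(\theta,\alpha_i) + \hat\Omega_i\lambda_i \end{pmatrix} = \hat t_i^{\Omega}(\theta,\gamma_i) + \hat t_i^{R}(\theta,\gamma_i).$$

Note that the formulae for the derivatives of Appendix G.1 apply for $\hat t_i^{\Omega}$, replacing $W$ by $\Omega$. Hence, we only need to derive the derivatives for $\hat t_i^{R}$.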
G.3.2. Derivatives with respect to the fixed effects.
First Derivatives 



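$$\hat T_i^{R}(\theta,\gamma_i) = \frac{\partial \hat t_i^{R}(\theta,\gamma_i)}{\partial\gamma_i'} = -\begin{pmatrix} 0 & 0 \\ 0 & \hat\Omega_i - \Omega_i \end{pmatrix}, \qquad T_i^{R} = E\left[\hat T_i^{R}\right] = -\begin{pmatrix} 0 & 0 \\ 0 & E[\hat\Omega_i - \Omega_i] \end{pmatrix}.$$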



Second and Third Derivatives

Since $\hat T_i^{R}(\theta,\gamma_i)$ does not depend on $\gamma_i$, the derivatives of order greater than one (and their expectations) are zero.

G.3.3. Derivatives with respect to the common parameters. 
First Derivatives 

G.4. Two-Step Score and Derivatives: Common Parameters. 

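G.4.1. Score.

$$\hat s_i(\theta,\gamma_i) = -\frac{1}{T}\sum_{t=1}^{T} G_{\theta}(z_{it};\theta,\alpha_i)'\lambda_i = -\hat G_{\theta_i}(\theta,\alpha_i)'\lambda_i.$$

Since this score does not depend explicitly on $\hat\Omega_i(\theta,\alpha_i)$, the formulae for the derivatives are the same as in Appendix G.2.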
Table A1: Common Parameter $\theta_2$

                 ρ_1 = 0                   ρ_1 = 0.3                 ρ_1 = 0.6                 ρ_1 = 0.9
Estimator   Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05

ψ = 2
OLS-FC      0.06  0.01  0.84  1.00    0.06  0.01  0.83  1.00    0.06  0.01  0.84  1.00    0.07  0.01  0.71  1.00
IV-FC       0.00  0.01  0.90  0.08   -0.01  0.02  0.84  0.11   -0.01  0.02  0.78  0.18   -0.01  0.02  0.63  0.28
OLS-RC      0.04  0.01  0.97  1.00    0.04  0.01  0.99  1.00    0.04  0.01  1.02  1.00    0.04  0.01  0.96  1.00
BC-OLS      0.04  0.01  0.97  1.00    0.04  0.01  0.99  1.00    0.04  0.01  1.02  1.00    0.04  0.01  0.96  1.00
IBC-OLS     0.04  0.01  0.97  1.00    0.04  0.01  0.99  1.00    0.04  0.01  1.02  1.00    0.04  0.01  0.96  1.00
IV-RC       0.00  0.01  1.00  0.06    0.00  0.01  1.01  0.05    0.00  0.01  1.00  0.05    0.00  0.01  1.01  0.05
BC-IV       0.00  0.01  0.99  0.06    0.00  0.01  1.01  0.05    0.00  0.01  1.00  0.05    0.00  0.01  1.00  0.05
IBC-IV      0.00  0.01  0.99  0.06    0.00  0.01  1.01  0.05    0.00  0.01  1.00  0.05    0.00  0.01  1.00  0.05

ψ = 4
OLS-FC      0.12  0.01  1.09  1.00    0.12  0.01  1.03  1.00    0.12  0.01  1.10  1.00    0.12  0.01  1.07  1.00
IV-FC       0.00  0.02  0.94  0.07   -0.01  0.02  0.89  0.08   -0.01  0.02  0.92  0.09   -0.01  0.03  0.79  0.15
OLS-RC      0.10  0.01  1.06  1.00    0.10  0.01  1.05  1.00    0.11  0.01  1.08  1.00    0.11  0.01  1.07  1.00
BC-OLS      0.10  0.01  1.06  1.00    0.10  0.01  1.05  1.00    0.11  0.01  1.08  1.00    0.11  0.01  1.07  1.00
IBC-OLS     0.10  0.01  1.06  1.00    0.10  0.01  1.05  1.00    0.11  0.01  1.08  1.00    0.11  0.01  1.07  1.00
IV-RC       0.00  0.02  0.98  0.06    0.00  0.02  0.96  0.06    0.00  0.02  1.01  0.05    0.00  0.02  1.00  0.06
BC-IV       0.00  0.02  0.97  0.05    0.00  0.02  0.95  0.06    0.00  0.02  1.00  0.05    0.00  0.02  0.99  0.06
IBC-IV      0.00  0.02  0.97  0.05    0.00  0.02  0.95  0.06    0.00  0.02  1.00  0.05    0.00  0.02  0.99  0.06

ψ = 6
OLS-FC      0.16  0.01  1.27  1.00    0.16  0.01  1.22  1.00    0.16  0.01  1.25  1.00    0.16  0.01  1.34  1.00
IV-FC       0.00  0.03  0.95  0.06    0.00  0.03  0.94  0.06   -0.01  0.03  0.92  0.08   -0.01  0.04  0.92  0.08
OLS-RC      0.15  0.01  1.20  1.00    0.15  0.01  1.21  1.00    0.15  0.01  1.21  1.00    0.15  0.01  1.26  1.00
BC-OLS      0.15  0.01  1.20  1.00    0.15  0.01  1.21  1.00    0.15  0.01  1.21  1.00    0.15  0.01  1.26  1.00
IBC-OLS     0.15  0.01  1.20  1.00    0.15  0.01  1.21  1.00    0.15  0.01  1.21  1.00    0.15  0.01  1.26  1.00
IV-RC       0.00  0.03  0.98  0.06    0.00  0.03  1.00  0.04    0.00  0.03  1.01  0.05    0.00  0.03  1.05  0.04
BC-IV       0.00  0.03  0.95  0.06    0.00  0.03  0.97  0.04    0.00  0.03  0.98  0.05    0.00  0.03  1.02  0.04
IBC-IV      0.00  0.03  0.95  0.06    0.00  0.03  0.97  0.04    0.00  0.03  0.98  0.05    0.00  0.03  1.02  0.04

RC/FC refers to random/fixed coefficient model. BC/IBC refers to bias corrected/iterated bias corrected estimates.
Note: 1,000 repetitions.
Table A2: Mean of Individual Specific Parameter $\mu_1 = E[\alpha_{1i}]$

                 ρ_1 = 0                   ρ_1 = 0.3                 ρ_1 = 0.6                 ρ_1 = 0.9
Estimator   Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05

ψ = 2
OLS-FC      2.33  1.65  0.35  0.78    2.58  1.91  0.31  0.79    3.01  1.92  0.30  0.84    3.68  2.20  0.27  0.89
IV-FC     [entries illegible in the source]
OLS-RC    [entries illegible in the source]
BC-OLS    [entries illegible in the source]
IBC-OLS   [entries illegible in the source]
IV-RC     [entries illegible in the source]
BC-IV     [entries illegible in the source]
IBC-IV    [entries illegible in the source]

ψ = 4
OLS-FC      4.15  1.84  0.52  0.90    4.43  1.95  0.49  0.90    4.90  2.08  0.46  0.93    5.45  2.22  0.43  0.95
IV-FC       0.09  1.85  0.59  0.25    0.21  1.92  0.57  0.27    0.56  1.89  0.58  0.27    1.03  1.94  0.57  0.32
OLS-RC      3.19  1.76  1.06  0.41    3.12  1.81  1.04  0.38    3.12  1.76  1.07  0.38    3.18  1.78  1.06  0.38
BC-OLS      3.19  1.76  0.93  0.50    3.12  1.81  0.91  0.48    3.12  1.76  0.94  0.47    3.18  1.78  0.93  0.47
IBC-OLS     3.19  1.76  0.93  0.50    3.12  1.81  0.91  0.48    3.12  1.76  0.94  0.47    3.18  1.78  0.93  0.47
IV-RC       0.06  1.78  1.15  0.03   -0.01  1.86  1.10  0.03    0.03  1.78  1.15  0.03    0.10  1.78  1.15  0.03
BC-IV       0.00  1.78  1.02  0.05   -0.08  1.86  0.98  0.05   -0.04  1.78  1.02  0.05    0.03  1.78  1.02  0.05
IBC-IV     -0.01  1.78  1.02  0.05   -0.08  1.86  0.98  0.05   -0.04  1.78  1.02  0.05    0.03  1.78  1.02  0.05

ψ = 6
OLS-FC      5.62  2.13  0.62  0.93    5.87  2.25  0.58  0.92    6.19  2.31  0.57  0.93    6.35  2.28  0.57  0.95
IV-FC       0.14  2.29  0.69  0.17    0.26  2.34  0.68  0.19    0.53  2.31  0.69  0.19    0.80  2.26  0.70  0.20
OLS-RC      4.69  2.10  1.08  0.53    4.59  2.14  1.07  0.51    4.52  2.11  1.09  0.50    4.30  2.01  1.15  0.46
BC-OLS      4.69  2.10  0.88  0.69    4.59  2.14  0.88  0.67    4.52  2.11  0.89  0.64    4.30  2.01  0.94  0.61
IBC-OLS     4.69  2.10  0.88  0.69    4.59  2.14  0.88  0.67    4.52  2.11  0.89  0.64    4.30  2.01  0.94  0.61
IV-RC       0.09  2.30  1.12  0.04    0.05  2.33  1.11  0.03    0.03  2.23  1.16  0.02   -0.10  2.18  1.19  0.02
BC-IV      -0.05  2.30  0.95  0.06   -0.10  2.33  0.94  0.06   -0.12  2.23  0.97  0.06   -0.26  2.18  1.00  0.05
IBC-IV     -0.06  2.30  0.95  0.06   -0.10  2.32  0.94  0.06   -0.13  2.23  0.97  0.06   -0.26  2.18  1.00  0.05

RC/FC refers to random/fixed coefficient model. BC/IBC refers to bias corrected/iterated bias corrected estimates.
Note: 1,000 repetitions.



Table A3: Standard Deviation of the Individual Specific Parameter $\sigma_1 = E[(\alpha_{1i} - \mu_1)^2]^{1/2}$

                 ρ_1 = 0                   ρ_1 = 0.3                 ρ_1 = 0.6                 ρ_1 = 0.9
Estimator   Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05    Bias    SD SE/SD p;.05

ψ = 2
OLS-RC      0.01  1.06  1.02  0.05    0.15  1.06  1.02  0.05    0.11  1.08  0.99  0.06    0.17  1.06  0.99  0.06
BC-OLS     -0.63  1.10  1.04  0.10   -0.48  1.11  1.04  0.09   -0.52  1.12  1.01  0.10   -0.46  1.11  1.01  0.09
IBC-OLS    -0.63  1.10  1.04  0.10   -0.48  1.11  1.04  0.09   -0.52  1.12  1.01  0.10   -0.46  1.11  1.01  0.09
IV-RC       0.38  1.08  1.03  0.05    0.47  1.10  1.02  0.06    0.41  1.13  0.98  0.06    0.46  1.11  0.99  0.06
BC-IV      -0.25  1.13  1.05  0.06   -0.16  1.14  1.04  0.06   -0.22  1.18  1.00  0.07   -0.17  1.16  1.00  0.06
IBC-IV     -0.25  1.13  1.05  0.06   -0.16  1.14  1.04  0.06   -0.22  1.18  1.00  0.07   -0.17  1.16  1.00  0.06

ψ = 4
OLS-RC      0.89  1.21  1.17  0.04    0.98  1.20  1.17  0.05    1.08  1.16  1.19  0.06    1.09  1.05  1.23  0.08
BC-OLS     -1.24  1.46  1.19  0.08   -1.13  1.44  1.20  0.08   -1.02  1.41  1.20  0.05   -0.98  1.23  1.24  0.03
IBC-OLS    -1.24  1.46  1.19  0.08   -1.13  1.44  1.20  0.08   -1.02  1.41  1.20  0.05   -0.98  1.24  1.22  0.03
IV-RC       1.84  1.28  1.17  0.17    1.83  1.29  1.16  0.16    1.85  1.26  1.17  0.18    1.87  1.18  1.17  0.20
BC-IV      -0.25  1.52  1.20  0.03   -0.26  1.52  1.19  0.02   -0.26  1.51  1.18  0.03   -0.21  1.37  1.18  0.02
IBC-IV     -0.25  1.52  1.20  0.03   -0.26  1.52  1.19  0.02   -0.26  1.51  1.18  0.03   -0.21  1.38  1.16  0.02

ψ = 6
OLS-RC      2.35  1.33  1.38  0.14    2.60  1.40  1.30  0.21    2.57  1.37  1.31  0.21    2.69  1.31  1.28  0.38
BC-OLS     -2.06  2.04  1.41  0.00   -1.71  2.14  1.30  0.01   -1.75  2.06  1.35  0.00   -1.54  1.78  1.28  0.00
IBC-OLS    -2.06  2.04  1.41  0.00   -1.71  2.14  1.30  0.01   -1.75  2.06  1.35  0.00   -1.54  1.80  1.26  0.00
IV-RC       3.79  1.52  1.31  0.46    3.87  1.55  1.28  0.49    3.78  1.50  1.30  0.47    3.87  1.48  1.23  0.60
BC-IV      -0.49  2.13  1.37  0.00   -0.42  2.23  1.29  0.01   -0.55  2.14  1.35  0.00   -0.40  1.96  1.24  0.01
IBC-IV     -0.49  2.13  1.37  0.00   -0.41  2.23  1.29  0.01   -0.55  2.14  1.35  0.00   -0.39  1.97  1.22  0.02

RC/FC refers to random/fixed coefficient model. BC/IBC refers to bias corrected/iterated bias corrected estimates.
Note: 1,000 repetitions.
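
The four columns reported for each design in the tables above can be computed from simulation output along the following lines. This is a minimal sketch under stated assumptions, not the authors' code: it assumes that SE/SD is the ratio of the average estimated standard error to the Monte Carlo standard deviation, and that p;.05 is the empirical rejection frequency of a two-sided test of the true parameter value at the nominal 5% level using a normal critical value. The function name `mc_summary` and its arguments are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def mc_summary(
    estimates: Sequence[float],   # one point estimate per Monte Carlo repetition
    std_errors: Sequence[float],  # the matching estimated standard errors
    truth: float,                 # true parameter value in the simulation design
    crit: float = 1.96,           # assumed normal critical value for a 5% test
) -> Dict[str, float]:
    """Summarize Monte Carlo draws as Bias, SD, SE/SD and rejection frequency."""
    bias = statistics.mean(estimates) - truth
    sd = statistics.stdev(estimates)  # sample standard deviation across repetitions
    se_sd = statistics.mean(std_errors) / sd
    # Share of repetitions in which |t-statistic| exceeds the critical value.
    rejections = sum(
        1 for b, se in zip(estimates, std_errors) if abs(b - truth) / se > crit
    )
    return {
        "bias": bias,
        "sd": sd,
        "se_sd": se_sd,
        "p05": rejections / len(estimates),
    }


# Toy input with five made-up repetitions, for illustration only.
summary = mc_summary([1.0, 1.2, 0.8, 1.1, 0.9], [0.1] * 5, truth=1.0)
```

Under these assumptions, an SE/SD ratio close to one together with a rejection frequency close to 0.05 indicates that the estimated standard errors track the sampling variability and that the test has approximately correct size.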



